ABSTRACT

This thesis reports the results of digitally simulating outstar
embedding field learning networks. An outstar is a device that is
capable of inductively learning to associate the occurrence of a
command event with a pattern of events. Once this association is
learned, the outstar will reproduce the pattern whenever the command
event occurs.

A simple outstar was studied. It was found that a fast rate for
forgetting accumulated experience is necessary to maintain control
of the amplitudes of the outstar's responses. It was further found
that a fast rate for forgetting accumulated experience results in
poor noise resistance but good adaptability. A slower forgetting rate
results in good noise resistance but poor adaptability. The practical
aspects of thresholds were studied.

A laterally inhibiting outstar was studied. It was found that
the active process of lateral inhibition results in both good noise
resistance and good adaptability.

A short study of outstar avalanches was made. An outstar avalanche
is a cascade of outstars which can learn and reproduce time varying
patterns. It was found that a command node cascade avalanche does not
work well because of pulse lengthening. A "long axon with collaterals"
avalanche was studied.

A virtual laterally inhibiting outstar was studied.

A convenient method for analyzing new formulations for the learning
process in an outstar was developed. A "generalized" learning process
was developed and studied.

The analogy between embedding field theory and the nervous system
of living organisms was introduced. The theoretical proposal that
learning on the neurophysiological level is due to the production of
transmitter in a synaptic cleft proportional to the correlation between
presynaptic and postsynaptic membrane potentials was used to
simplistically model a learning process for outstars.

CHAPTER 1

EMBEDDING FIELD NETWORKS

section 1.1 Introduction

Grossberg has developed a theory for learning called embedding
field theory. (Refs. 1 - 10) He has proposed several devices designed
in accordance with this theory to handle broad categories of learning
phenomena. These devices are inductive learning machines which are
governed by a set of deterministic equations. He has qualitatively
demonstrated their learning abilities. He has further drawn an
analogy between embedding field theory and the nervous system of
living organisms. Based on this analogy, he has made a concrete
proposal for the neurophysiological phenomena underlying learning
in living organisms.

By means of a digital simulation, this thesis experimentally
studies one embedding field device called an outstar, and it
examines a combination of outstars called an outstar avalanche. The
analogy between the nervous system of living organisms and embedding
field theory will be introduced and examined.

For the uninitiated, we will begin by deriving the basic concepts
of embedding field theory from intuitive ideas about learning.



section 1.2 Illustrative Derivation of an Embedding Field Network



Embedding field theory is a mathematical model for learning. To
gain an operational appreciation of this model, consider modeling the
following learning experiment:

An experimenter teaches a subject an arbitrary time sequential list
of letters of the alphabet by saying the list to the subject several
times. At the end of this instruction, the subject is requested to
repeat the list. If he can, then it is concluded that he has learned
the list.

In order for the subject to learn the list, the letters composing
the list must be familiar to him and must appear to be separate events.
One of the tasks of this experiment will be to teach the subject to
combine the separate letters of the alphabet into a new event which is
the list. We expect that after instruction, presentation of the first
letter of the list will automatically result in the subject expecting to
hear the succeeding letters of this list.

We begin our description by modeling the subject's state before the
experiment has begun. He is familiar with the letters of the alphabet
and recognizes them as separate events. We model this by assigning a
distinct point in space to each letter of the alphabet and calling
these points nodes. To denote recognition of a letter of the alphabet,
$A_i$, we assign a time varying process $x_i(t)$ to each node $V_i$.
$x_i(t)$ has the properties:

(a) $x_i(t) \approx 0$ when the letter $A_i$ has not been presented to
the subject recently.

(b) $x_i(t) > 0$ when the letter $A_i$ has been presented to the
subject recently.

As $x_i(t)$ indicates only the two conditions (a) and (b) above, we
may constrain $x_i(t)$ to be non negative.

We model the experimenter's ability to communicate with the subject
similarly. When the experimenter says the letter $A_i$ to the subject, a
non negative input pulse $P_i(t)$ is delivered to the appropriate node
$V_i$ in the subject. The pulse $P_i(t)$ has the properties:

(c) $P_i(t) > 0$ when the experimenter says $A_i$.

(d) $P_i(t) = 0$ at all other times.

It will require a small, but finite, time interval for the experimenter
to say $A_i$. $P_i(t)$ is non zero during this time interval.

We are now in a position to write a differential equation for $x_i(t)$:

eqn (1)    $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t)$

Equation (1) was chosen to model the response of $V_i$ to presentation
of the letter $A_i$ because it is the simplest continuous representation
for $x_i(t)$ satisfying conditions (a) and (b) on $x_i(t)$.
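
As a minimal sketch of how eqn (1) behaves, the following Python fragment
integrates it with a forward-Euler step. The decay rate, the rectangular
pulse shape, the step size and all function names are illustrative
assumptions; they are not taken from the simulation described in
appendix A.

```python
def rectangular_pulse(t, start=1.0, width=0.5, height=1.0):
    """Assumed input P_i(t): non negative, zero outside a short interval."""
    return height if start <= t < start + width else 0.0


def simulate_node(alpha, pulse, dt, steps):
    """Forward-Euler integration of dx/dt = -alpha*x + P(t) with x(0) = 0."""
    x = 0.0
    trace = []
    for n in range(steps):
        t = n * dt
        x += dt * (-alpha * x + pulse(t))
        trace.append(x)
    return trace


# x stays at zero before the pulse, rises while the letter is being said,
# and decays back toward zero afterwards (conditions (a) and (b)).
trace = simulate_node(alpha=2.0, pulse=rectangular_pulse, dt=0.01, steps=1000)
print(max(trace), trace[-1])
```

Because the pulse is non negative and the process starts at zero, the
simulated $x_i(t)$ never becomes negative, which agrees with the
constraint placed on $x_i(t)$ above.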

The experiment is now begun. The experimenter says a list
$A_1, A_2, \ldots, A_n$ to the subject. There will be a time interval,
$w$, between the presentation of each letter. For simplicity, we assume
that these time intervals are all the same.

At the beginning of the experiment the subject has no idea of what
the experimenter's list is. Therefore, when $A_1$ is presented the subject
can only guess, with probability 1/26 of success, what the experimenter's
selection for the second letter is. This carries throughout the list.
If the experimenter has presented letter $A_j$, the subject can only guess,
with probability 1/26 of success, what the $A_{j+1}$ letter is.

However, when the experimenter presents the list for the second time,
we expect the subject to be able to predict the succeeding letters of this
list with much greater accuracy. When the subject has learned the list,
he will be able to predict all the letters in the list, in their correct
order, with certainty.

We must now model this process.

Firstly, we have said that the subject has the ability to predict
what the succeeding letters of the list are, and this ability becomes
more successful after each presentation of the list.

Let us model this prediction process by connecting each of our nodes
to every other node with transmission lines which we shall call edges.
We allow the signal $x_i(t)$ from a node to travel away from that node along
the edges to each of the other nodes where it can act as input to these
nodes. The actual prediction of the letter following the $A_j$ letter is
modeled in the same manner as awareness of a letter being presented by
the experimenter. The appropriate $x_i(t)$ process is excited by the
prediction signals arriving via the edge from $V_j$.

The subject's ability to only blindly guess what each succeeding
letter is when the list is first presented means that equal prediction
signals are received at all nodes at the beginning of the experiment.
His ability to predict the entire list in the correct order after learning
means that after excitation of the $V_j$ node, a prediction signal is
received only at the correct $V_{j+1}$ node.

Prediction of the letters of the list in their correct order requires
that the $V_{j+1}$ node be excited by prediction signals before the
$V_{j+2}$ node. To accomplish this, we constrain the prediction signals
traveling along the edges to a finite transmission velocity. That is, the
signal $x_j(t)$ originating at node $V_j$ arrives at node $V_i$ after a
time delay of $\tau_{ji}$ time units.

Figure 1.2.1. Geometric schematic of nodes and directed edges.

The situation we have described so far is pictured in figure 1.2.1.
In figure 1.2.1 we have drawn the edge between $V_i$ and $V_j$ as two
directed edges, $e_{ij}$ and $e_{ji}$, to stress that the lists $A_i, A_j$
and $A_j, A_i$ are distinct. The arrowhead indicates the direction of
transmission along the directed edge.

Referring to figure 1.2.1 one can easily see how the subject predicts
the succeeding letters of the list after he has learned it. If he has
learned the list $A_i, A_j, A_k, \ldots$, excitement of $V_i$ will result
in a signal traveling to $V_j$. It will arrive $\tau_{ij}$ time units later
and $x_j(t)$ will be excited and a signal will be sent to $V_k$ and so on.
For simplicity, we shall assume that all the transmission delays are equal,
or $\tau_{ij} = \tau$ for all $i$ and $j$.

The effect of learning on the subject's prediction process is as
follows:

(e) Before learning, excitement of the $V_j$ node by presentation of
the $A_j$ letter results in equal prediction signals arriving at all nodes
to which $V_j$ is connected by edges $\tau$ time units after presentation
of the letter.

(f) After learning, excitement of the $V_j$ node by presentation of
the letter $A_j$ results in a large prediction signal being delivered to
the $V_{j+1}$ node from $V_j$ $\tau$ time units after presentation of
$A_j$. No prediction signals, or at least small prediction signals, are
delivered to the other nodes connected to $V_j$ by edges.

Now, we must develop a mechanism which converts the subject's
prediction process from state (e) to state (f) as the list is repeatedly
presented.

To develop this mechanism, we note that the experimenter is presenting
letters to the subject every $w$ time units. If $w$ is too small, say
1 millisecond, the subject will be unable to distinguish the separate
letters of the list and it will be impossible for him to learn the list.
On the other hand, if $w$ is too large, say 24 hours, we expect the subject
to have lost the context of the experiment. That is, if the experimenter
said "A" yesterday, and then says "C" today, we would not be surprised
if the subject responded, "See what?". Again, we do not expect the subject
to learn the list when $w$ is too large. In between these extremes we
expect the subject to do very well.

We now analyze this dependence of the subject's learning ability
on the presentation interval $w$.

If $w$ is large, say $w \gg \tau$, then the process $x_j(t)$ has long ago
decayed to zero before the next letter $A_{j+1}$ is presented to the
subject and $x_{j+1}$ becomes large. Additionally, the prediction signals
from $V_j$ have long since traveled to the ends of the edges from $V_j$,
performed their prediction excitement of the other nodes, and decayed. As
$w$ is shortened we begin to arrive at the situation where the prediction
signal from $V_j$ arriving at the other nodes is still large when $V_{j+1}$
is excited by presentation of the $A_{j+1}$ letter. When $w = \tau$ the
signal from $V_j$ arriving at the other nodes exactly correlates with the
$x_{j+1}$ process. Making $w$ smaller yet, such that $w \ll \tau$, means
that many nodes are large when the prediction signals from any one of the
excited nodes arrive at any other.



It seems likely that the subject's learning ability is dependent
on the correlation between his prediction signal arriving at the $V_{j+1}$
node from the $V_j$ node and excitement of the $x_{j+1}$ process by
presentation of the $A_{j+1}$ letter. Assuming that this is the key to the
subject's learning ability, we may write down some properties for his
learning mechanism:

(g) If in one presentation of the list the prediction signal arriving
at the $V_j$ node from the $V_i$ node is large at the same time that the
$x_j$ process is large, then on subsequent predictions of the list a large
prediction signal is delivered to $V_j$ from the $V_i$ node.

(h) If condition (g) is not met, then on subsequent predictions,
a small prediction signal is delivered to the $V_j$ node.

In conditions (g) and (h) we have gotten into some geometrical
difficulty. Previously we had decided that the prediction signal traveling
along an edge $e_{ij}$ is the $x_i$ process from the $V_i$ node suitably
time delayed to account for the finite transmission velocity. If this
signal is allowed to arrive at the $V_j$ node unchanged it will always be
large $\tau$ time units after excitement of $V_i$. Yet in conditions (g)
and (h) we have described a process which determines the amplitude of the
prediction signal being delivered to $V_j$ based on the past correlations
between the prediction signal and the $x_j$ process. The difficulty is that
we must now require $V_j$ to perform two functions: that of keeping track
of recent presentations to, or predictions by, the subject of the $A_j$
letter via the $x_j$ process; and that of determining how vigorously the
subject should predict the $A_j$ letter based on past experience. The
second of these functions was placed at $V_j$ because it requires that both
the prediction signal $x_i(t - \tau)$ and the $x_j$ process be
simultaneously available for correlation.

Reference to figure 1.2.1 shows that besides $V_j$, the other place
where $x_i(t - \tau)$ and $x_j(t)$ are simultaneously available for
correlation is the arrowhead of the $e_{ij}$ directed edge. In order to
maintain one function per element of figure 1.2.1, we shall locate a
process, $z_{ij}(t)$, in the arrowheads of the directed edges with
properties (g) and (h). This simplifies things considerably, because we can
make this process an amplifier of prediction signals with the further
properties:

(i) When $z_{ij}$ is large, a large prediction signal is delivered to
$V_j$ from $V_i$.

(j) When $z_{ij}$ is small, a small prediction signal is delivered to
$V_j$ from $V_i$.

A modification of eqn (1) is now in order to account for conditions
(i) and (j) above:

eqn (2)    $\dot{x}_j(t) = -\alpha x_j(t) + P_j(t) + \sum_i z_{ij}(t)\, x_i(t - \tau)$

Considering conditions (e), (f), (g), and (h) we may formulate an
equation for $z_{ij}$ as a function of time.

Condition (e) implies that before the experiment begins, $z_{ij} \approx 0$.
That is, the initial conditions on the $z_{ij}$ are:

    $z_{ij}(0) \approx 0$

Conditions (f) and (g) imply that $z_{ij}(t)$ gets large only when the
predicting signal $x_i(t - \tau)$ and process $x_j(t)$ are large at the same
time and that $z_{ij}(t)$ remains large for a long time afterward. That is:

    $\dot{z}_{ij}(t) \sim x_i(t - \tau)\, x_j(t)$

Condition (h) implies that when $x_i(t - \tau)$ and $x_j(t)$ are not large
at the same time, then $z_{ij}(t)$ decays toward zero. That is:

    $\dot{z}_{ij}(t) \sim -u\, z_{ij}(t)$

Combining the above results, we have:

eqn (3)    $\dot{z}_{ij}(t) = -u\, z_{ij}(t) + x_i(t - \tau)\, x_j(t)$

with initial conditions

    $z_{ij}(0) \approx 0$.



Equations (2) and (3) are sufficient to describe the subject's
learning process and its dependence on the experimenter's presentation
interval $w$. If the experimenter presents the letters of the list with
a time interval between each letter of approximately $\tau$ time units, then
when the $A_{j+1}$ letter is presented, the prediction signal from the $V_j$
node has arrived at the arrowhead of the $e_{j,j+1}$ edge and the product
$x_{j+1}(t)\, x_j(t - \tau)$ is large. From eqn (3) $z_{j,j+1}(t)$ grows. On
subsequent repetitions of the list the same conditions are met and
$z_{j,j+1}(t)$ grows larger yet. On the other hand, $x_k(t)$ for the nodes
$V_k$, $k \neq j+1$, corresponding to letters of the alphabet other than
$A_{j+1}$ are small when $A_{j+1}$ is presented and from eqn (3),
$z_{jk}(t)$ decays toward zero for $k \neq j+1$.

When the subject is asked to recall the list, he uses his prediction
process, starting at the first letter $A_1$, and sequentially excites each
of the nodes corresponding to letters in the list in their correct
order by following the path of large $z_{ij}$'s until the end of the list
is reached. To prevent saddling ourselves with a cumbersome output
mechanism, we assume that the experimenter can read the amplitudes of
the $x_j(t)$ processes and considers a large $x_j(t)$ as a response by the
subject.

One can easily see that when $w \gg \tau$, none of the products
$x_i(t - \tau)\, x_j(t)$ are large and the subject learns nothing. On the
other hand when $w \ll \tau$, many nodes, $V_k$, are excited before the
prediction signals from the node associated with the first letter in the
list arrive at their corresponding arrowheads. Thus the associated
$z_{1k}(t)$'s grow large. This situation continues as the prediction signals
from the subsequent letters of the list arrive at their arrowheads. Called
upon to repeat the list, the subject's prediction process will equally
excite many nodes at the same time. To the subject, it will appear
that every letter of this list succeeds every other letter. Although
he has limited his guesses to the letters in the list, the subject is
no better off than he was at the beginning of the experiment in being
able to repeat the list.
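
The dependence of learning on the presentation interval can be sketched
numerically for a single pair of letters. The fragment below is an
illustration under stated assumptions, not the thesis simulation: it
integrates eqn (1) for two nodes and eqn (3) for the arrowhead between
them with a forward-Euler step, it omits the prediction feedback term of
eqn (2) for clarity, and the parameter values and the name `learn_pair`
are hypothetical.

```python
def learn_pair(w, tau=1.0, alpha=2.0, u=0.05, dt=0.01,
               pulse_width=0.2, t_end=20.0):
    """Present A_1 at t = 0 and A_2 at t = w; return z_12 at t = t_end.

    x_1 and x_2 follow eqn (1); z_12 follows eqn (3). The feedback of
    prediction signals into x_2 (eqn (2)) is deliberately left out.
    """
    delay = round(tau / dt)          # transmission delay in steps
    width = round(pulse_width / dt)  # pulse length in steps
    w_steps = round(w / dt)
    steps = round(t_end / dt)
    x1_hist = [0.0] * (steps + 1)    # x1_hist[n] holds x_1 at step n
    x1 = x2 = z = 0.0
    for n in range(steps):
        p1 = 1.0 if n < width else 0.0
        p2 = 1.0 if w_steps <= n < w_steps + width else 0.0
        x1_delayed = x1_hist[n - delay] if n >= delay else 0.0
        z += dt * (-u * z + x1_delayed * x2)   # eqn (3)
        x1 += dt * (-alpha * x1 + p1)          # eqn (1) for node 1
        x2 += dt * (-alpha * x2 + p2)          # eqn (1) for node 2
        x1_hist[n + 1] = x1
    return z


# Matched interval (w = tau) versus a much longer interval (w = 10 tau).
print(learn_pair(1.0), learn_pair(10.0))
```

With these assumed values the arrowhead process ends far larger when
$w = \tau$ than when $w \gg \tau$, which is the qualitative behavior argued
for above. The two-node sketch cannot show the confusion that arises when
$w \ll \tau$, since that effect depends on many nodes being excited at once.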



section 1.3 Generalized Embedding Fields



The embedding field network derived in section 1.2 to model learning
of an alphabetic list is a specialized example of embedding field
networks. This particular network was derived because it illustrates
vividly the major ideas behind embedding field theory and its derivation
depends only upon intuitive ideas about learning. It is not the only
embedding field network which can learn time sequential lists and it
may not be the best network for this purpose. The alert reader may
have noticed that it can not repeat a list in which a letter is repeated.
In addition to being dependent on the experimenter's presentation
interval $w$, its performance is highly dependent on the time delay $\tau$
and the parameters $\alpha$ and $u$ in eqns (2) and (3). It has other
problems, but remarkably, Grossberg has shown that these problems are
qualitatively similar to problems experienced by human subjects trying to
learn an alphabetic list. (The interested reader is referred to references 1
and 3 for a detailed analysis of networks similar to that derived in
section 1.2.)

However, the network of section 1.2 contains most of the elements
of embedding field theory and we shall pause here to list them. Figure
1.3.1 shows the pictorial representation of these elements.

Figure 1.3.1. Elements of an embedding field network.
The process $x_i(t)$ is located at the $V_i$ node.
The process $x_j(t)$ is located at the $V_j$ node.
The process $z_{ij}(t)$ is located at the arrowhead $N_{ij}$.
The prediction signal $x_i(t - \tau)$ is arriving at the arrowhead $N_{ij}$.

(1) A node $V_i$ representing an elemental event which the network
is capable of recognizing and responding to.

(2) A directed edge $e_{ij}$ allowing transmission of signals at a
finite velocity in one direction from node $V_i$ to node $V_j$. Pictorially,
a directed edge is drawn as an arrow shaft with the arrowhead indicating
the transmission direction.

(3) Arrowheads $N_{ij}$ representing the termination of directed
edge $e_{ij}$ on the node $V_j$. Because the directed edges transmit signals
without affecting them, it will not be necessary to reference signals
traveling along a directed edge until they reach the arrowhead. In
all subsequent equations in this paper, signals which have been
transmitted along a directed edge will be identified by the effect of the
transmission delay on them, i.e. $x_i(t - \tau)$.

(4) Input pulses $P_i(t)$ to node $V_i$ indicating the occurrence of the
elemental event represented by $V_i$ in the environment external to the
network. Input pulses will always be non negative and identically
zero except in a small time interval around the occurrence of event $i$.
It is assumed throughout this paper that $P_i(t)$ is immediately available
at $V_i$ whenever event $i$ occurs. Because embedding field theory does not
deal with the input apparatus necessary to deliver inputs to nodes,
no geometric symbol has been developed for this purpose.

(5) A process $x_j(t)$ located at $V_j$ with the general formulation:

1.3.1    $\dot{x}_j(t) = -a(t) + \sum_i b_i(t - \tau) + P_j(t)$

The amplitude of $x_j(t)$ indicates whether the event represented by
$V_j$ has recently been observed or predicted by the network.

The term $a(t)$ is designed such that $x_j(t)$ always returns to some
ambient state indicative of no recent occurrence or prediction of event
$j$. The term $\sum_i b_i(t - \tau)$ is the effect of prediction signals on
$V_j$. The summation is taken over every arrowhead $N_{ij}$ impinging on
$V_j$. $b_i(t - \tau)$ is the modified prediction signal received by $V_j$
from the arrowheads $N_{ij}$ impinging on it.

We will most frequently use the following formulations for these
functions:

    $a(t) = \alpha\, x_j(t)$

    $b_i(t - \tau) = \beta\, z_{ij}(t)\, x_i(t - \tau)$

With these formulations, equation 1.3.1 is:

    $\dot{x}_j(t) = -\alpha x_j(t) + \beta \sum_i z_{ij}(t)\, x_i(t - \tau) + P_j(t)$

(6) A prediction signal modification process $z_{ij}(t)$ located in
the $N_{ij}$ arrowhead with the general formulation:

1.3.2    $\dot{z}_{ij}(t) = -u(t) + f(x_i(t - \tau),\, x_j(t))$

The $z_{ij}(t)$'s are the memory of the network. In general $z_{ij}(t)$ will
correlate prediction signals $x_i(t - \tau)$ with the process $x_j(t)$
via function $f$, and deliver a suitably modified prediction signal
$b_i(t - \tau)$ to $V_j$. The amplitude of $z_{ij}(t)$ is the network's
memory of how well $x_i(t - \tau)$ and $x_j(t)$ have correlated in the past.
The term $u(t)$ is the network's "forgetfulness". We will most frequently
use the following formulations for these functions:

    $u(t) = u\, z_{ij}(t)$

    $f(x_i(t - \tau),\, x_j(t)) = v\, x_i(t - \tau)\, x_j(t)$

With these formulations, equation 1.3.2 is:

    $\dot{z}_{ij}(t) = -u\, z_{ij}(t) + v\, x_i(t - \tau)\, x_j(t)$

Combining the geometric elements of figure 1.3.1 in various ways
and suitably defining the terms of eqns 1.3.1 and 1.3.2, Grossberg
has developed networks which qualitatively model many general categories
of learning phenomena. In addition to describing learning phenomena
on the psychological level as in section 1.2, Grossberg has drawn an
analogy between embedding field networks and nerve networks in living
organisms which is a concrete theoretical proposal for the
neurophysiological phenomena underlying learning in living organisms. (See
references 2 and )

The power of embedding field theory is that it is a generalized
theory describing learning with deterministic equations. The equations
are simple enough to allow mathematical analyses and the establishment
of the conditions necessary for them to perform the tasks desired of
them. Due to the large number of nodes and arrowheads necessary to
model a particular learning phenomenon, exact analytic descriptions
of their performance are difficult. However, the basic simplicity of
the equations makes the simulation of their performance straightforward
on a high speed computer.



CHAPTER 2 THE OUTSTAR AND THE OUTSTAR AVALANCHE EMBEDDING FIELD
NETWORKS

section 2.1 Description of the Networks

The embedding field network of section 1.2 was derived to illustrate
the concepts of embedding field theory. Combining the elements of his
theory in another way, Grossberg has proposed two very interesting
networks which this paper will study. The outstar network, and a
combination of outstars called an outstar avalanche, are networks capable
of learning and reproducing any number of complicated space-time
patterns.

Figure 2.1.1 presents the geometric schematic for an outstar and
the basic equations governing its performance. The $N$ grid nodes
$V_1, V_2, \ldots, V_N$ represent the set of elemental events the network is
capable of recognizing. Each of the distinct combinations of elemental
events taken singly or several at a time is a distinct pattern.

The command node represents an event which always precedes a
particular pattern of grid elemental events. The function of the outstar
is to learn to associate the occurrence of the event associated with the
command node causally with the occurrence of the grid pattern. After
learning this "causal" association, the occurrence of the command node
event will result in the associated pattern occurring on the grid - even
though there are no external inputs to the grid.

GRID NODES

EQUATIONS GOVERNING NETWORK PERFORMANCE

2.1.1    $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

2.1.2    $\dot{x}_i(t) = -\alpha x_i(t) + \beta\, z_{ci}(t)\, x_c(t - \tau) + P_i(t)$

2.1.3    $\dot{z}_{ci}(t) = -u\, z_{ci}(t) + v\, x_c(t - \tau)\, x_i(t)$

Figure 2.1.1. An outstar network and the equations
governing its performance.

STARTING NODE

EQUATIONS GOVERNING NETWORK PERFORMANCE

2.1.4    $\dot{x}_{c1}(t) = -\alpha x_{c1}(t) + P_{c1}(t)$

2.1.5    $\dot{x}_{ci}(t) = -\alpha x_{ci}(t) + \beta\, x_{c,i-1}(t - \tau)$    for $1 < i \leq M$

2.1.6    $\dot{x}_j(t) = -\alpha x_j(t) + \beta \sum_{i=1}^{M} z_{ci,j}(t)\, x_{ci}(t - \tau) + P_j(t)$    for $1 \leq j \leq N$

2.1.7    $\dot{z}_{ci,j}(t) = -u\, z_{ci,j}(t) + v\, x_{ci}(t - \tau)\, x_j(t)$

Figure 2.1.2. An outstar avalanche and the equations
governing its performance.

As an illustration, the outstar may be used to model a pianist
playing a piano from a score. Excitement of the $x_c$ process at the
command node represents the event of reading the notes associated with
a chord on his score. The grid nodes represent his fingers and a large
$x_i(t)$ is interpreted as the $i$th finger being lowered to strike a piano
key. A small $x_j(t)$ represents the $j$th finger being raised so as not
to strike a key. By practice the piano player will learn the proper
finger positions associated with the written chord in the musical
score. The outstar will learn the proper finger positions by reading
the chord on the score and having its fingers placed in the proper
positions sufficiently often. This finger pattern will be remembered by
large and small $z_{ci}(t)$'s at the appropriate arrowheads impinging on
the grid nodes. After having learned the association between the written
chord on the score and the finger positions, both the pianist's and the
outstar's fingers will automatically assume the proper position when the
chord is read.
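
Equations 2.1.1 through 2.1.3 of figure 2.1.1 can be explored with a
short forward-Euler sketch. Everything specific in it is an assumption
made for illustration: the parameter values, the rectangular pulses, the
choice to present the grid pattern $\tau$ time units after each command
pulse, and the name `simulate_outstar`. It is not the simulation of
appendix A.

```python
def simulate_outstar(theta, z0, presentations=20, period=5.0, tau=1.0,
                     alpha=2.0, beta=0.5, u=0.1, v=1.0,
                     pulse_width=0.2, dt=0.01):
    """Train one outstar on a pattern and return its normalized memory.

    theta: relative pulse amplitudes on the grid (assumed to sum to 1).
    z0:    initial values of the z_ci processes (arbitrary, positive).
    Returns z_ci / sum_j z_cj after the last presentation period.
    """
    n_grid = len(theta)
    delay = round(tau / dt)
    width = round(pulse_width / dt)
    period_steps = round(period / dt)
    steps = presentations * period_steps
    xc_hist = [0.0] * (steps + 1)     # xc_hist[n] holds x_c at step n
    xc = 0.0
    x = [0.0] * n_grid
    z = list(z0)
    for n in range(steps):
        phase = n % period_steps
        p_c = 1.0 if phase < width else 0.0            # command pulse
        grid_on = delay <= phase < delay + width       # pattern pulse
        s = xc_hist[n - delay] if n >= delay else 0.0  # x_c(t - tau)
        for i in range(n_grid):
            p_i = theta[i] if grid_on else 0.0
            dx = -alpha * x[i] + beta * z[i] * s + p_i  # eqn 2.1.2
            dz = -u * z[i] + v * s * x[i]               # eqn 2.1.3
            x[i] += dt * dx
            z[i] += dt * dz
        xc += dt * (-alpha * xc + p_c)                  # eqn 2.1.1
        xc_hist[n + 1] = xc
    total = sum(z)
    return [zi / total for zi in z]


theta = [0.2, 0.3, 0.5]
wrong_start = [0.5, 0.3, 0.2]   # memory initially favors the wrong finger
for m in (1, 5, 20):
    learned = simulate_outstar(theta, wrong_start, presentations=m)
    print(m, [round(val, 3) for val in learned])
```

Under these assumptions the normalized memory starts close to the wrong
initial pattern and moves toward the presented one as the number of
presentations grows, because the old memory is forgotten at rate $u$ while
each presentation adds a contribution proportional to the pattern.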

Figure 2.1.2 presents the geometric schematic of an outstar
avalanche and the basic equations governing its behavior. An outstar
avalanche is a cascaded series of outstars. Each outstar learns and
is capable of reproducing the pattern on the grid approximately $\tau$
time units after its command node is excited. The command nodes are
deterministically cascaded. That is, excitation of the starting node
by an input will always result in a prediction signal going to $V_{c2}$
which will send one to $V_{c3}$ and so on. There is no learning associated
with this. The command node cascade is an embedding field clock.
Because the prediction signals travel along directed edges at constant
velocities, excitement of the starting node results in a prediction
signal arriving at command node $V_{ci}$, $(i - 1)\tau$ time units later. If
a time varying pattern of elemental events is being played on the grid,
then each command node takes a picture of that pattern when it is
excited. Thus associating the start of a particular time varying
pattern, say a piano sonata, with excitement of the starting node will
result in a time sequential series of pictures approximating that
pattern being learned by the network. If many command nodes are
cascaded in this manner and $\tau$ is made sufficiently small, the sampled
data approximation of the pattern can be made arbitrarily close to
the pattern.
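
The clock behavior of the command node cascade, and the claim that a
smaller delay with more command nodes gives a closer sampled data
approximation, can be illustrated with a few lines of Python. This sketch
covers only the timing; it contains no learning dynamics, and the function
names and the quadratic test pattern are hypothetical.

```python
def command_arrival_times(num_commands, tau):
    """Time at which command node i (1-based) is excited: (i - 1) * tau."""
    return [(i - 1) * tau for i in range(1, num_commands + 1)]


def sample_pattern(pattern, num_commands, tau):
    """Value of a time varying pattern at each command node's excitation."""
    return [pattern(t) for t in command_arrival_times(num_commands, tau)]


def hold_error(pattern, duration, num_commands, probes=1000):
    """Worst gap between the pattern and its sample-and-hold approximation."""
    tau = duration / num_commands
    samples = sample_pattern(pattern, num_commands, tau)
    worst = 0.0
    for k in range(probes):
        t = duration * k / probes
        idx = min(int(t / tau), num_commands - 1)  # most recent command node
        worst = max(worst, abs(pattern(t) - samples[idx]))
    return worst


def ramp(t):
    """Assumed test pattern: a smoothly increasing amplitude."""
    return t * t


print(command_arrival_times(4, 0.5))
print(hold_error(ramp, 1.0, 4), hold_error(ramp, 1.0, 40))
```

For the assumed test pattern, ten times as many command nodes with one
tenth the delay reduces the worst sampling gap by roughly a factor of ten.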



section 2.2 Theoretical Work on Outstars and Outstar Avalanches



Grossberg has mathematically analyzed the pattern learning abilities
of outstars and outstar avalanches extensively. (Refs 7 and 8) In
the process of this analysis he developed particularly handy mathematical
descriptions of a pattern of elemental events, the pattern learned
by the outstar to approximate this pattern, and the pattern reproduced
by the outstar on its grid when predicting the elemental event pattern.

An elemental event pattern is defined by the values of the input
pulses $P_i(t)$ to the grid nodes. Although their amplitudes may be
different, all input pulses have the same shape. We can describe the
relation of the $i$th input pulse to the other $N - 1$ inputs (consisting
of non zero pulses $P_j(t)$ indicating that event $j$ is part of the
pattern and zero pulses $P_k(t)$ indicating that event $k$ is not part of
the pattern) by forming the probability:

    $\theta_i = \dfrac{P_i(t)}{\sum_{k=1}^{N} P_k(t)}$    when any $P_j(t)$ comprising the pattern is non zero

The elemental pattern can be completely described by the $N$ dimensional
vector,

    $\theta = (\theta_1, \theta_2, \ldots, \theta_N)$

Note that this description of the pattern is amplitude independent.
That is, $\theta$ defines the pattern whether that pattern is presented
vigorously or not. Additionally note that by the definition of
the $\theta_i$, $\theta$ not only describes a pattern by the occurrence or
non occurrence of elemental events in it, but also by the relative strength
of the occurrence of those events. In the piano playing example, this
corresponds to describing the finger positions for a chord by indicating
which fingers are raised so as not to strike keys and which fingers
are lowered to strike keys, plus the relative pressure each of the lowered
fingers is to exert on the keys.

Since the $P_i(t)$'s have the same shape, and differ only in amplitude,
the $\theta_i$'s are constants during presentation of the pattern.

In a similar manner the outstar's response to presentation or
prediction of this pattern can be described by the probability vector:

    $X(t) = (X_1(t), X_2(t), \ldots, X_N(t))$

    where $X_i(t) = \dfrac{x_i(t)}{\sum_{j=1}^{N} x_j(t)}$

The pattern learned by the outstar to approximate this pattern can be
described by the probability vector:

    $y(t) = (y_1(t), y_2(t), \ldots, y_N(t))$

    where $y_i(t) = \dfrac{z_{ci}(t)}{\sum_{j=1}^{N} z_{cj}(t)}$
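
The three probability vectors share one normalization, which the
following sketch makes explicit. The helper name `probability_vector` and
the sample amplitudes are assumptions chosen for illustration.

```python
def probability_vector(values):
    """Divide each non negative amplitude by the sum of all amplitudes."""
    total = sum(values)
    if total == 0:
        raise ValueError("at least one amplitude must be non zero")
    return [value / total for value in values]


# The same chord played softly and played vigorously: the pulse amplitudes
# differ, but the pattern vector theta is the same.
soft = probability_vector([1.0, 2.0, 1.0])
loud = probability_vector([2.0, 4.0, 2.0])
print(soft, loud)
```

Applying the same helper to the $x_i(t)$ values gives $X(t)$, and applying it
to the $z_{ci}(t)$ values gives $y(t)$, so all three descriptions can be
compared directly regardless of overall amplitude.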

Now suppose that the pattern $\theta$ has been presented to the outstar
$M$ times. Then Grossberg has proved that starting with arbitrary initial
data for the $x_i(t)$'s and $z_{ci}(t)$'s:

(a) For every $M \geq 1$, the limits

    $Q_i^{(M)} = \lim_{t \to \infty} X_i^{(M)}(t)$

and

    $R_i^{(M)} = \lim_{t \to \infty} y_i^{(M)}(t)$

exist.

(b) For every $M \geq 1$ and for all times $t$ after the last
presentation of the pattern, the probabilities $X_i^{(M)}(t)$ and
$y_i^{(M)}(t)$ are monotonic in opposite senses with
$|y_i^{(M)}(t) - X_i^{(M)}(t)|$ non increasing and are constant
on intervals where the prediction signal from $V_c$ is zero.

(c) $\lim_{M \to \infty} m_i^{(M)} = \lim_{M \to \infty} M_i^{(M)} = \theta_i$

where

    $m_i^{(M)}$ = minimum of $X_i^{(M)}(t_0)$ or $y_i^{(M)}(t_0)$

and $t_0$ is the instant the last presentation of the pattern was
completed, and

    $M_i^{(M)}$ = maximum of $X_i^{(M)}(t_0)$ or $y_i^{(M)}(t_0)$

Thus by (a) - (c),

    $\lim_{M \to \infty} \lim_{t \to \infty} X_i^{(M)}(t) = \lim_{M \to \infty} \lim_{t \to \infty} y_i^{(M)}(t) = \theta_i$

(d) The functions $f_i^{(M)}(t) = y_i^{(M)}(t) - X_i^{(M)}(t)$ and
$g_i^{(M)}(t) = X_i^{(M)}(t) - \theta_i$ change sign at most once and not
at all if $f_i^{(M)}(0)\, g_i^{(M)}(0) \geq 0$. Moreover,
$f_i^{(M)}(0)\, g_i^{(M)}(0) > 0$ implies $f_i^{(M)}(t)\, g_i^{(M)}(t) > 0$
for all $t \geq 0$.

Interpreting these results, we see that (c) implies that the
network's memory of the pattern and its predictions of the pattern
converge to the pattern as the number of times the pattern is
presented increases, or "practice makes perfect". (a) and (b) ensure
that the network's memory of and prediction of the pattern after the
last presentation of the pattern will get no worse than it was
immediately after that last presentation. (d) shows that there is at
most one oscillation in the convergence and therefore the network's
learning ability is stable.

An additional benefit of result (c) is that if the network started
associating one pattern with the command node event and it is decided
that association is an error, then a new pattern, the correct one,
may be learned over the old one with sufficient practice. That is,
all errors are correctable.
section 2.3 Approach to the Study

Grossberg's theoretical results greatly enhance the attractiveness
of outstars and avalanches as devices for modeling certain categories
of learning phenomena. As qualitative models they have wide application.
(See refs 6 and 9) However, beyond the qualitative insight that they
provide, are they practical? The mathematics guarantee that an avalanche
will learn a piano sonata with sufficient practice. If sufficient
practice means forty years, we would do well to go shopping for another
model - not because they do not work, but because they do not work well
enough.

Thus the question "How well do they work?" is pertinent. This
is the question that this paper addresses. It is a practical question
and outstars and avalanches are considered as practical devices that
learn throughout the rest of this paper.

In order to accomplish this study, a digital simulation of the
networks was programmed onto a computer. The details of this simulation
and an evaluation of its accuracy are provided in appendix A. All
attempts were made to reduce the artificialities and errors introduced
by this method of study. However, constraints were forced on the study
by the digital simulation and these constraints will be noted and
explained as they occur in this paper.

As an outstar avalanche is a cascade of outstars, the primary
emphasis of this study is on outstars. In studying the outstars,
attention is devoted to the possible interactions of one outstar in an
avalanche with another. Where avalanches are presented, they are more
or less used as tests to confirm the conclusions established while
CHAPTER 3 THE SIMPLE OUTSTAR

section 3.1 Specification of Parameters for the Study

The geometric schematic and equations in figure 2.1.1 describe
the simplest outstar. The equations are repeated here for easy
reference:

3.1.1    \dot{x}_c(t) = -\alpha x_c(t) + P_c(t)

3.1.2    \dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta z_{ci}(t) x_c(t - \tau)

3.1.3    \dot{z}_{ci}(t) = -u z_{ci}(t) + v x_i(t) x_c(t - \tau)

In order to study this outstar, we must assign numbers to the constants
\alpha, \beta, u, v, and \tau; initial conditions must be assigned to
the variables x_c, x_i, and z_{ci}; a shape and amplitude for the inputs
P_c and P_i must be selected; and the number of pattern nodes, N, must
be specified. Additionally, the test pattern to be taught to the outstar
must be decided upon.

A great deal of experimental time can be saved if these parameters
are specified in a somewhat rational way. A rationale can be developed
for any method of specifying the parameters, so we shall arbitrarily
begin with the inputs.

Firstly, the inputs are only used to indicate the occurrence of
elemental events external to the outstar. All we require of them is
that they be non negative in an interval around the occurrence of the
elemental event and zero at all other times. Also, we would like them
to reflect the strength of presentation of the events they represent.
For a first try we will make them identical in shape, duration and
amplitude for both the grid inputs P_i(t) and the command input P_c(t).
An impulse might be a good shape for them, but there might be effects
associated with duration that would be interesting to see. On the
other hand, if we want to analytically check our results, then we want
the inputs' shape to be simple enough to make the analysis tractable.
A rectangular pulse of amplitude A and duration \delta is suitable. Note
that with this selection for inputs we have implied that our input
apparatus is a digital sampling device which samples the continuous
variation of events in the external environment at time t_0, sets the
inputs to nodes corresponding to events present in the environment
at t_0 to value A, and holds these values until the next sample is taken
at time t_0 + \delta. If we recall that an avalanche performs a similar
digital approximation to time varying events, this selection for inputs
is not too bad.

As the direct response to the inputs is linear, we may leave the
amplitude, A, of the input pulses arbitrary. In selection of the duration
\delta, we run into a compromise with the digital simulation. An accurate
simulation of the response to a long duration pulse requires considerable
computation time. Thus to minimize computation time, \delta should be
short. Yet the pulses were given a finite duration to study possible
effects of duration, so we do not want \delta to be too short. With this
trade off in mind, a good selection for \delta would be the shortest rise
time in the outstar. The rise times of the outstar are 1/\alpha for the
x processes at the nodes, and 1/u for the z processes at the arrowheads.
u is the "forgetting rate" of the outstar and it would be expected that
the forgetting rate of the outstar should be slower than the response
rate, \alpha, of the x processes. Therefore it is reasonable that \alpha
should be greater than u. This implies that 1/\alpha is the shortest
rise time in the outstar and we shall set \delta = 1/\alpha.

The x processes at the command node and the grid nodes indicate
the recent presentation to or prediction by the outstar of events. At
the beginning of a learning experiment it is reasonable to assume that
there have been no recent presentations or predictions of the events to
be learned. The initial conditions for the x processes can be assumed
zero, i.e. x_c(0) = 0 = x_i(0) for all i.

The response rate \alpha of the x processes has already been specified
as \alpha = 1/\delta. Thus all the parameters for the command node's x_c
process have been specified. For the grid nodes \tau, \beta, and the
initial conditions on the z's still must be specified. To save
computation time, \tau should be small. As there is no feedback from the
grid to the command node, there is no necessity for \tau to be non zero
in this simple outstar. In a digital simulation, however, the accuracy
is improved if there is a time delay between simultaneous processes and
making \tau > 0 is advantageous. A suitable selection for \tau is
\tau = \delta.

From equation 3.1.2 it can be seen that \beta and z_{ci}(t) determine
the amplitude of the prediction signal being admitted to grid node
V_i. As the outstar's memory is the z_{ci}(t)'s, it is the most important
factor in this prediction signal amplitude determination. Setting
\beta = 1 will make analyzing the effect of the z's on the prediction
signals easier.

The parameters associated with the z processes, u, v, and initial
conditions z_{ci}(0), must be specified. u is the "forgetting rate" of
the outstar. As we want the outstar to remember what it has learned,
we want u to be small. Remembering that computation time is scarce,
a small u for this experiment is anything such that the decay time
1/u of the z processes is several times longer than the length of the
experiment.

Selecting v is a problem. As can be seen from equation 3.1.3,
v determines the rise rate and amplitude of the z_{ci} process given an
x_i(t) response and the prediction signal x_c(t - \tau). In presenting
a pattern to the outstar to be learned, the best learning should occur
when the inputs to the grid nodes are presented at the same time as
the prediction signals from the command node arrive at the arrowheads.
The problem is that in this situation, how well should the outstar learn
the pattern on the first presentation? To answer this question, we
need some way of measuring how well the outstar has learned a pattern
after presentation.

A tentative operational measurement would be to say that the outstar
has learned a pattern well when the prediction process drives the
amplitudes of the grid node x processes to at least the same values as
they are driven to by the event inputs. Using this measurement we
can specify v's which result in well learning in one presentation or
two presentations and so on.

However, this does not end the problem associated with "rationally"
selecting an initial v for an experiment. Suppose we specify a v
which results in well learning in one presentation. What value should
this v have? A rational selection of an initial v requires solving
the outstar equations. The reason why the outstar is being simulated
is the difficulty of analytically solving these equations. To avoid
these difficulties, the procedure taken in this study was to specify
all other parameters in the outstar including the number of
presentations required for well learning. A guess is then made for a v
and an experiment is performed to see what amplitude the prediction
process will drive the grid nodes to after one pattern presentation.
The guessed v is then appropriately scaled to result in the specified
well learning criterion.

For the current experiment, v was selected to result in well
learning in two pattern presentations.

Concerning the initial conditions for the z processes, we expect
on the first presentation of the pattern that the network has not
previously learned anything about the pattern. That is, z_{ci}(0) = 0
for all i. However, we would like to see what happens if one of the
z_{ci}'s is not zero at the beginning of the experiment. Therefore we
will make one of the z_{ci}(0) non zero, but small.

Only the number, N, of grid nodes and the test pattern to be
taught the outstar remain to be specified. As we are only performing
this experiment as an initial look at an outstar, a good test pattern
would be presentations of one event which the outstar should learn
to associate with the command event. An additional event presented
at a time well removed from arrival of prediction signals from the
command node would be a good way to test interference between outstars
in an avalanche. As v was selected to result in well learning in two
presentations, this test pattern will be presented twice and then a
prediction will be called for to see how well the pattern has been
learned.

This gives us two grid nodes. A third grid node is included to
study the effects of the non zero initial conditioned z processes.
No inputs will be given to this grid node.

We need now only to assign numbers to the parameters in accordance
with the above specifications:

Geometric parameters:

    N = number of grid nodes = 3
    \tau = time delay of prediction signal = 0.3 sec.

Input parameters:

    Input pulse shape is rectangular
    A = input pulse amplitude = 10
    \delta = input pulse duration = 0.3 sec.
    Input pulses will be delivered to the command node, V_c, at times:
        0.1 sec., 1.9 sec., and 3.7 sec.
    No input pulses will be delivered to grid node V_1.
    Input pulses will be delivered to grid node V_2 at times: 0.4 sec.
        and 2.2 sec.
    Input pulses will be delivered to grid node V_3 at times: 1.0 sec.
        and 2.8 sec.

Network parameters:

    \alpha = time constant of x process = 3.3333 sec.^{-1}
    \beta = prediction signal amplification constant = 1.0
    u = "forgetting rate" = 0.01 sec.^{-1}
    v = correlation amplification constant = 1.6 (satisfies well
        learning in two presentations criterion)

Initial conditions:

    x_c(0) = x_i(0) = 0 for all i
    z_{c1}(0) = 0.1
    z_{c2}(0) = z_{c3}(0) = 0



The above lengthy description of the reasons for selection of the
parameters for the experiment to be presented in the next section was
provided as an illustration of the decisions that must be made when
performing the experiments in this study. Except where noted, in the
future the same reasoning will underlie the selection of parameters
for experiments.
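The parameter choices of this section are enough to set up a small
numerical experiment on equations 3.1.1 to 3.1.3. What follows is a
minimal sketch only: it assumes a forward Euler integration with a step
of 0.001 sec., which is not necessarily the method of appendix A, and
the function names are hypothetical. The pulse times, amplitudes and
constants are the ones listed above.

```python
import numpy as np


def rect_pulses(t, starts, amplitude, duration):
    """Input at time t: `amplitude` while any pulse is on, else zero."""
    return amplitude * sum(1.0 for s in starts if s <= t < s + duration)


def simulate_outstar(u=0.01, v=1.6, alpha=1.0 / 0.3, beta=1.0, tau=0.3,
                     A=10.0, delta=0.3, t_end=5.0, dt=0.001):
    """Forward Euler integration of equations 3.1.1 - 3.1.3 (a sketch)."""
    n = int(round(t_end / dt)) + 1
    lag = int(round(tau / dt))            # prediction delay in steps
    cmd_times = [0.1, 1.9, 3.7]           # command node V_c
    grid_times = [[], [0.4, 2.2], [1.0, 2.8]]   # grid nodes V_1, V_2, V_3

    xc = np.zeros(n)
    x = np.zeros((n, 3))
    z = np.zeros((n, 3))
    z[0] = [0.1, 0.0, 0.0]                # z_c1(0) = 0.1, others zero

    for k in range(n - 1):
        t = k * dt
        xc_delayed = xc[k - lag] if k >= lag else 0.0   # x_c(t - tau)
        Pc = rect_pulses(t, cmd_times, A, delta)
        P = np.array([rect_pulses(t, g, A, delta) for g in grid_times])
        xc[k + 1] = xc[k] + dt * (-alpha * xc[k] + Pc)
        x[k + 1] = x[k] + dt * (-alpha * x[k] + P + beta * z[k] * xc_delayed)
        z[k + 1] = z[k] + dt * (-u * z[k] + v * x[k] * xc_delayed)

    return np.arange(n) * dt, xc, x, z


t, xc, x, z = simulate_outstar()
```

The arrays returned by this sketch can be plotted against t to produce
traces of the kind discussed in the next section; the numbers it gives
depend on the step size and should be read as approximate.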
section 3.2 Experiment I - A Look at a Simple Outstar

Figure 3.2.1 shows the results of the experiment outlined in
section 3.1. The inputs to the nodes are plotted on the same trace as
the x process response of the nodes.

A striking feature of figure 3.2.1 is that the x process node
responses all have amplitudes of significantly less than the amplitudes
of the input pulses. It can be seen that this is as it should be
if we consider the equation governing the response of a node to an
input only:

    \dot{x}_i(t) = -\alpha x_i(t) + P_i(t)

The solution of this equation for a rectangular input pulse of
amplitude A and duration \delta is:

    x_i(t) = (A/\alpha)(1 - e^{-\alpha t})                                for 0 \leq t \leq \delta

    x_i(t) = (A/\alpha)(1 - e^{-\alpha \delta}) e^{-\alpha (t - \delta)}   for t \geq \delta

The maximum of this response occurs at t = \delta. For the parameters
specified for this experiment, the maximum amplitude of an x_i(t) response
to an input pulse only is:

    max x_i(t) = 1.9

which is about 20% of the amplitude of the input pulses.
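The peak value quoted above can be checked directly from the closed
form solution. The following sketch evaluates that solution for
A = 10, \alpha = 1/0.3 sec.^{-1} and \delta = 0.3 sec.; the function
name is hypothetical.

```python
import math


def pulse_response(t, A=10.0, alpha=1.0 / 0.3, delta=0.3):
    """x_i(t) for dx/dt = -alpha*x + P(t), x(0) = 0, P = A on [0, delta]."""
    if t < 0:
        return 0.0
    if t <= delta:
        # Rising part, while the rectangular pulse is on.
        return (A / alpha) * (1.0 - math.exp(-alpha * t))
    # Exponential decay from the value reached at the end of the pulse.
    peak_value = (A / alpha) * (1.0 - math.exp(-alpha * delta))
    return peak_value * math.exp(-alpha * (t - delta))


peak = pulse_response(0.3)   # response at the end of the pulse, t = delta
ratio = peak / 10.0          # peak as a fraction of the input amplitude
```

With \delta = 1/\alpha the peak is (A/\alpha)(1 - e^{-1}), roughly 1.9,
which is a little under one fifth of the input amplitude.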

The pattern we intended to teach to the outstar was to associate
the occurrence of the command event with event 2. The outstar was
instructed in this pattern twice by presenting the command event to it
and then presenting event 2 to it \tau time units later. This can be
seen from the command input trace and grid node input trace.

After the instruction was over, the command event alone was presented
to see if the outstar had learned the pattern. As can be seen from
V_2's third response, the outstar did predict event 2 and we can consider
that it has learned the pattern.

Figure 3.2.1. The results of experiment I - an initial look
at a simple outstar.

This experiment was also designed to see what effect a small non
zero initial condition on a z would have. Thus z_{c1} was given the
initial value of 0.1 while z_{c2} and z_{c3} were given zero initial
values. As can be seen from the x_1 response trace, the small non zero
initial value for z_{c1} had no perceptible effect. x_1(t) did respond to
the prediction signals, but the response was so small that it does
not show on the scale chosen for figure 3.2.1.

We gave the input pulses a finite duration to see if there would
be any effects associated with this duration. Such a duration effect
is the fact that the x responses reach a maximum at the end of the input
pulses and then decay exponentially away from this maximum. This effect
is entirely due to the shape selected for the input pulses and the
exponential response of the x processes. If we accept the sampled-data
input apparatus described in section 3.1 as the input apparatus
for the outstar, then this effect has important consequences. It says
that the outstar's response to a sample taken at time t_0 extends, with
large amplitude, into the next sampling period starting at t_0 + \delta
and beyond. In this experiment, we selected the inputs to V_3 to occur
2\delta after the inputs to V_2. As explained above, the inputs to V_2
were selected to result in maximum learning. From the trace for x_3(t)
it can be seen that event 3 was also learned to be associated with the
command event, although to a much lesser extent. This resulted from the
"tail" of the prediction signal still being reasonably large when event
3 occurred. The product x_3(t) x_c(t - \tau) was therefore sufficient to
cause z_{c3} to grow as can be seen from z_{c3}(t)'s trace. Thus when the
outstar was tested to see what it had learned, it predicted event 3
as well as event 2.

Thus the "tail" duration effect will result in the outstar learning
not only what happens in the sample in which prediction signals arrive
from the command node, but also in the sample taken after that. By
symmetry, it will learn the samples taken before in the same way.
We will mark this effect for further study.

Another effect to note in figure 3.2.1 is that z_{c2}(t) grew with
each presentation of the pattern and on the recall test. Because u
was chosen small, z_{c2}(t) did not decrease and essentially acted as
an integrator of v x_2(t) x_c(t - \tau). The effect of the growing
z_{c2}(t) can be seen in the trace for x_2(t) where the x response
increases in amplitude on each presentation or prediction. If this
growth continues, we could expect x_2 responses to get impractically
large. Experiment I was continued and the x_2(t) responses did continue
their growth. Figure 3.2.2 shows this continuation and it can be seen
from the trace for x_2(t) that the x_2 responses continued to grow on
predictions only. Not only are the x_2 responses growing with each
prediction, but a quick look at the z_{c2}(t) trace will show that they
are growing at an increasing rate.
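The remark that a slowly forgetting z process acts as an integrator can
be illustrated in isolation. The sketch below drives
\dot{z} = -u z + v s(t) with a hypothetical rectangular s(t), standing
in for the product x_2(t) x_c(t - \tau), and compares the result with
the plain integral v \int s dt. The drive shape, the step size and the
function name are assumptions made for illustration only.

```python
def integrate_z(u, v, drive, dt):
    """Forward Euler solution of dz/dt = -u*z + v*s(t) with z(0) = 0."""
    z = 0.0
    for s in drive:
        z += dt * (-u * z + v * s)
    return z


dt = 0.001
n_steps = 5000                      # 5 seconds of simulated time
# Hypothetical drive: s = 1 between 0.5 sec. and 1.5 sec., zero elsewhere.
drive = [1.0 if 0.5 <= k * dt < 1.5 else 0.0 for k in range(n_steps)]

v = 1.6
pure_integral = v * sum(drive) * dt             # v * integral of s

slow_ratio = integrate_z(0.01, v, drive, dt) / pure_integral
fast_ratio = integrate_z(1.0 / 1.8, v, drive, dt) / pure_integral
```

With u = 0.01 sec.^{-1} the z process keeps nearly all of the integrated
drive over the 5 seconds, while with a decay time of 1.8 seconds most of
it has leaked away by the end of the run.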



Experiment I was continued not only to study the growth of x_2
responses but also to test the theoretical prediction that outstars
are capable of correcting all mistakes. An attempt was made to correct
two types of mistakes in the continuation. It was decided to consider
the already learned association between the command event and event 2
as a mistake and that the correct association should be with event 3.
Therefore, event 3 was presented \tau time units after presentation of
the command event three times. Event 2 was not presented at all.

The second type of mistake was simulation of a "random" mistake
by presenting event 1 once \tau time units after presentation of the
command event.

Figure 3.2.2. Continuation of experiment I.

The results are interesting. Due to their growth, x_2 responses
continued to be greater than x_1 and x_3 responses. The x_3 responses were
catching up with the x_2 responses, but from the z_{c2}(t) and z_{c3}(t)
traces, it can be seen that it will require many presentations of the
V_c -> V_3 association before x_3 responses will reach a point where we
could say that the mistake is corrected.

From x_1's trace it can be seen that the "random" mistake was
remembered by the outstar. It was also predicted with increasing
amplitude on subsequent predictions. However, with the results of
experiment I plotted as they are in figure 3.2.2 it is difficult to see
if any mistakes were corrected. The theoretical prediction that all
mistakes could be corrected involved the convergence of the probabilities
X_i(t) and y_i(t) to \theta_i. Translating the data from figure 3.2.2 to
those probabilities, we have the following results:

Table 3.2.1

Translation of data from figure 3.2.2 to probabilities suitable
for comparison to theoretical prediction that an outstar can correct
all mistakes

    Response number, M       0        1        2        3

    V_1    \theta_1          0        0.5      0        0
           X_1               0        0.188    0.083    0.097
           y_1               0.07     0.083    0.091    0.101

    V_2    \theta_2          1.0      0        0        0
           X_2               0.892    0.563    0.625    0.612
           y_2               0.886    0.75     0.682    0.632

    V_3    \theta_3          0        0.5      1.0      1.0
           X_3               0.107    0.249    0.292    0.319
           y_3               0.107    0.167    0.227    0.265



The M = 0 response column is the results from the last response
in figure 3.2.1 and is the initial data that the continuation of
experiment I began with. The M = 1 response column begins the attempt
to correct the mistake V_c -> V_2 to V_c -> V_3 and includes the "random"
mistake of presenting event 1. The M = 2 and M = 3 response columns are
the continuing effort to correct V_c -> V_2 to V_c -> V_3 without "random"
mistakes.

Except when the random mistake occurred, X_1 and y_1 remain small
and about the same magnitude as the duration effect "error" of event 3
in the first part of experiment I. We conclude that a "random" mistake
affects the memory of the outstar to a small extent.



Table 3.2.1 does show that the V_c -> V_2 mistake is being corrected
to V_c -> V_3 as X_2 and y_2 are decreasing while X_3 and y_3 are
increasing. However, from the numbers we can conclude that it will
require many presentations of the V_c -> V_3 pattern before the
magnitudes of X_3 and y_3 exceed X_2 and y_2 and many more presentations
of V_c -> V_3 before X_3 and y_3 bear the same relation to X_2 and y_2 as
X_2 and y_2 had to X_3 and y_3 in the M = 0 response. In the meantime, it
could be expected that the x response will have become unrealistically
large.



The uncontrollable growth of the x responses makes this outstar
an unattractive device. Although it conforms to the theoretical
predictions, the actual means by which we measure its performance is
the x response and not the X probabilities. The growing x responses
mean that in our piano playing example, this outstar will be punching
holes through the keyboard of the piano with its fingers when it plays
a frequently used chord. Thus, to make this a useful device, we must
find some means of limiting the x responses at a practical amplitude.
As we pointed out, the growth of the x responses was due to the growth
of the z_{ci} process which determines the amplitude of prediction
responses. We had chosen the "forgetting rate" u of the z_{ci} processes
to be small. At the time we did so, it seemed reasonable to have the
outstar forget slowly. However, non decaying z_{ci} processes have led us
to an undesirable situation. We will therefore try to control the
amplitude of the x response by increasing the "forgetting rate".
section 3.3 A Simple Outstar with a "Fast" Forgetting Rate



The forgetting rate of experiment I was selected to be "slow"
relative to the time scale of experiment I. In experiment I, the
characteristic decay time, 1/u, for the z_{ci} process was 100 seconds
which was long compared to the 11 seconds total length of the experiment.
In that 11 seconds, the network was asked to learn one pattern and then
to correct it. The time between presentations and/or predictions was
1.8 seconds. Thus, when we speak of a "fast" forgetting rate, we must
decide "fast relative to what?".

To conserve computation time, we shall make the forgetting rate
fast relative to the presentation and/or prediction time interval,
i.e. 1/u = 1.8 seconds, or u = 0.556 sec.^{-1}. This leads us into another
problem. The v of experiment I was selected on the "two presentations
mean well learning" criterion. That is, the z_{ci} process would get large
enough in two presentations of the pattern so that a prediction
following these presentations would drive the amplitudes of the x
processes to the same values as the input pulses alone would drive them.
If we expect the network to forget in time comparable to the presentation
interval, it would be better to change v such that it conformed to a
"one presentation means well learning" criterion. We will therefore
double v to v = 3.2.
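The difference between the two forgetting rates can be put in numbers.
In the idealized case where the prediction signal x_c(t - \tau) is zero
between command events, equation 3.1.3 reduces to \dot{z} = -u z and a
memory trace decays by the factor e^{-u T} over an interval T. The
sketch below evaluates that factor for the 1.8 second interval; the
function name is hypothetical, and the decaying "tail" of the prediction
signal is ignored.

```python
import math


def retained_fraction(u, interval):
    """Fraction of z left after `interval` seconds of pure decay."""
    return math.exp(-u * interval)


slow = retained_fraction(0.01, 1.8)        # forgetting rate of experiment I
fast = retained_fraction(1.0 / 1.8, 1.8)   # decay time equal to the interval
```

Under this idealization the slow rate keeps about 98% of a trace from
one command event to the next, while the fast rate keeps about 37%,
which is why a larger v is needed for a single presentation to suffice.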

To compare the fast forgetting rate outstar to the slow forgetting
rate outstar of experiment I, we shall re-perform the first part of
experiment I with all other parameters specified as they are in section
3.1. This experiment will be called experiment II.

Figure 3.3.1. The results of experiment II - a simple outstar
with a fast forgetting rate.

Figure 3.3.1 shows the results of this experiment. Because
the responses of this experiment were smaller than in experiment I,
the vertical scale for the x traces was doubled in figure 3.3.1.

As can be seen, we have managed to reasonably control the
amplitudes of the x responses by allowing the z's to decay between
excitements of the command node. At least the z's do not exhibit the
monotonic growth they did in experiment I. The intended association
V_c -> V_2 was learned well. Again there is some learning of V_c -> V_3
due to the "tails" of the prediction signal. The non zero initial
condition on z_{c1} produced no perceptible effect. It can be concluded
that the outstar performs very well over short periods of time. However,
with its memory decaying rapidly, how long will its memory persist?

This question hits upon one of the key features of an outstar.
The mathematical theorem concerning outstars states that the outstar's
memory of a pattern remains unimpaired for all time after the last
presentation of the pattern, provided no new or random pattern is
presented to it subsequently. Of course, in the language of the theorem,
this meant that the y_i's would not change even though the z_{ci}'s were
decaying exponentially. It looks like a fast forgetting outstar has
the opposite problem from the slowly forgetting one. That is, the
responses, while retaining the proper X_i probabilities to define the
pattern, are so minute that they are meaningless measured against a
practical scale. However, the third response on the z_{c2}(t) and
z_{c3}(t) traces shows that a prediction will cause the z processes to
grow.

Now, suppose that the z processes have all decayed to the point
where a prediction by the outstar results in meaninglessly small grid
x responses. Then if enough predictions are made rapidly enough, we
can "pump up" the z's to the point where the x responses are large
enough to mean something. Grossberg's theorem ensures that the
amplitudes of the x processes will remain in the proper ratios to one
another. Experiment II was continued to demonstrate this memory "pumping
up" and the results are shown in figure 3.3.2. As can be seen, the
outstar's memory was allowed to decay for a while and then the command
event was presented to the outstar three times in rapid succession.
"Pumping up" occurred as expected.

A psychological interpretation of memory pumping up would not be
tenuous. It is an everyday occurrence to have a piece of previously
learned information, a name say, on the "tip of one's tongue", but
not be able to recall it until all the associations connected to it
have been recalled. If we consider the name to be inscribed on the
grid of an outstar, then recalling things associated with the name
would be equivalent to rapid excitements of the command node. After
enough such excitements, the name would appear to "pop into one's
head". The memory of the name would then be "fresh" for some time after
being resurrected before it again faded into the "preconscious".
We will introduce a modification in section 3.5 which will make the idea
of a faded memory "popping" into the outstar's "head" more precise.

Of course, a presentation of the pattern after the outstar's z
processes have decayed to small values will also refresh its memory.

section 3.4 Resistance to Random Mistakes vs. Correction of
Learned Mistakes: A Philosophy for Learning in
Outstars

Experiment II was continued to investigate the effects of a
simulated random mistake on a simple outstar with a fast forgetting
rate. The results are shown in figure 3.4.1. Event 1 was presented
at the same time as event 2 to simulate the occurrence of a random
mistake in the pattern. As can be seen from the x_1 and x_2 traces,
the random mistake completely confused the outstar. Whereas the outstar
had previously learned the association V_c -> V_2, occurrence of the
random mistake resulted in the outstar remembering V_c -> V_2 and, to
only a slightly lesser extent, V_c -> V_1. The amplitudes of the second
and third prediction responses in figure 3.4.1 are significant enough
to conclude that the random event resulted in confusion. The memory
of a simple outstar with a fast forgetting rate has very little
resistance to random mistakes.

To understand the significance of this outstar's low resistance
to random mistakes, we must develop an understanding of the outstar's
relationship to its external environment. Up to now, we have just
been concerned with the internal workings of the outstar. Now consider
that the outstar is a machine which includes the outstar network
previously described plus an input apparatus. This machine "lives"
in an environment in which events occur. The input apparatus filters
the events occurring in the environment and delivers an input pulse
to the appropriate node in the outstar when one of the events the
outstar is capable of recognizing occurs. The outstar is capable
of learning the association between the command event and any events
which are represented by grid nodes if they occur approximately \tau
time units after occurrence of the command event.

Figure 3.4.1. Continuation of experiment II from Figure 3.3.2.
P_1(t) simulates a random mistake in the pattern previously
taught to the outstar.

In order for the outstar's learning ability to conform with
intuitive notions about learning, we would want it to learn that the
command event is associated with a particular pattern if and only if the
occurrence of the command event in the environment is usually followed by
the occurrence of the pattern. Suppose the outstar observed one bowling
ball colliding with another with the result that the first ball
stopped dead and the second bowling ball rolled away from the collision
point with the same velocity that the first bowling ball had before
the collision. After the first observation of this event, we would
expect the intelligent outstar to suspect that it had observed a law
of nature that applied to all bowling ball collisions. We would expect
the outstar to go from a state of ignorance about the conservation of
momentum to an intuitive understanding of it. Philosophically, we desire
the outstar to be an inductive learning machine.

If we describe this situation statistically, we may assign
probabilities to the occurrence of events in the environment. At any
given time, t, we may describe the likelihood of the occurrence of an
event associated with the node V_i in the outstar by the probability
PR_i. Additionally, we can describe the relationship between the
occurrence of events with the conditional probability PR_{j/k} which is
the probability of the occurrence of event j given that event k occurred
recently. In the outstar we are particularly concerned with the
probabilities PR_{i/c} where c is the command event and the i are the grid
events. To make the outstar an inductive learning machine, we want
it to learn V_c -> V_i only if PR_{i/c} is large. If PR_{j/c} is small
we would want the outstar definitely not to learn V_c -> V_j.

On the first occurrence of a pattern following the command event
by approximately \tau time units, the outstar can have no idea of how
large PR_{i/c} is. Therefore we would want it to only suspect that the
command event usually precedes this pattern. However, if the next
time the command event occurs, it is followed by the pattern, then
there is good evidence that PR_{i/c} is large and the outstar should
draw this conclusion. Now, in the real world, we expect background
noise. That is, if event j does not usually follow the occurrence of
event c, there is nevertheless a small probability that it will occur
as a random mistake sometime. In order to protect the outstar's
memory, we would want it to be resistant to drawing spurious
conclusions about the association of the command event with randomly
occurring mistakes. If the outstar observed the collision of bowling
balls in which one of the balls was shattered into many pieces, we would
not want this random occurrence to destroy its confidence in the
conservation of momentum.

The memory of a pattern in an outstar is contained in the
probabilities $y_i(t)$.

The equation describing the z's is:

$$\dot{z}_{ci}(t) = -u\,z_{ci}(t) + v\,x_i(t)\,x_c(t - \tau)$$

In the case where u is very small, this is equivalent to:

$$z_{ci}(t) = v \int_{-\infty}^{t} x_i(\xi)\, x_c(\xi - \tau)\, d\xi$$

where $x_i(t)$ is the response of a grid node to one input pulse in the
infinite time period and $x_c(t - \tau)$ is the prediction signal from
the command node $\tau$ time units before the grid event. Thus, if in all
time preceding t, the command event has been presented to the outstar
M times,

Thus, if $PR_{i/c}$ is large, corresponding to a causal association between
event c and event i in the environment, $z_{ci}(t)$ will be large. On the
other hand, for a small $PR_{j/c}$, corresponding to event j occurring
randomly and not causally associated with event c in the environment,
$z_{cj}(t)$ will be small. Thus the z's can be considered random variables
faithfully reflecting the a priori conditional probabilities in the
environment. Note that this reflection of the statistical description of
the environment is contained in the amplitudes of the z's and is built
up by experience with M presentations of the pattern. The resistance
of the simple outstar with a slow forgetting rate in experiment I
to random mistakes was due to this correspondence between the amplitudes
of the z's and the a priori probabilities in the environment. It may
be concluded that whereas the outstar's memory of a pattern is contained
in the $y_i(t)$'s, its memory of its experience is contained in the
amplitudes of the z processes. Thus, when its memory of its past
experience is allowed to be forgotten at a fast rate as in experiment II,
the occurrence of a random mistake has disastrous consequences for its
memory of the pattern.
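
The correspondence between the z amplitudes and the conditional
probabilities can be made concrete with a rough sketch. It rests on
assumptions that are not taken from this study: u is negligible, every
co-occurrence of the command event and grid event i contributes the same
correlation integral C, and event i accompanies the command event in a
fraction $PR_{i/c}$ of the M presentations. The symbol C is introduced
here only for illustration. Under those assumptions:

```latex
z_{ci}(t) \approx v \, M \, PR_{i/c} \, C,
\qquad
C = \int x_i(\xi)\, x_c(\xi - \tau)\, d\xi
\quad \text{(taken over a single co-occurrence)}
```

Read this way, a causally associated event (large $PR_{i/c}$) accumulates
an amplitude that grows with M, while a random mistake (small $PR_{j/c}$)
accumulates very little.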

It is not surprising that a machine which forgets its past
experience rapidly will be very susceptible to having its mind changed.
We may look at this as both a benefit and a drawback. In the slow
forgetting outstar of experiment I, the attempt to change its mind
about a previously learned pattern by teaching it a new one was only
partially successful. It required only two presentations of the
original pattern for the outstar to learn it. However, the evidence
of the attempt to correct this pattern indicated that many more
presentations of the correcting pattern would be required to change its
mind. The outstar's resistance to random mistakes was laudable, but
its relative inability to change with changing times could be a serious
drawback in its environment. On the other hand, the fast forgetting
outstar will have no trouble changing its mind with the times, but
its low resistance to random mistakes is also a serious drawback.

We may summarize the above heuristic discussion of the constant
u in an outstar:

(a) A small u implies:

(i) past experience is slowly forgotten

(ii) high resistance to random mistakes

(iii) low correctability of previously learned mistakes.

(b) A large u implies:

(i) past experience is rapidly forgotten

(ii) low resistance to random mistakes

(iii) high correctability of previously learned mistakes.

In addition, we must consider one further effect of the constant u
on the performance of an outstar:

(c) A small u results in uncontrolled growth of the grid x
processes' amplitudes.

Again it is stressed that "large" and "small" u's refer to whether
the characteristic decay time 1/u is long or short relative to the
expected time interval between presentations to and/or predictions
by the outstar.
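
The role of the decay time 1/u relative to the presentation interval can
be illustrated numerically. The following is a minimal sketch, not the
simulation used in this study: it assumes that each presentation adds a
fixed increment to a z process instantaneously and that z decays as
exp(-u t) between presentations. The function name `z_history` and its
parameters are hypothetical.

```python
import math

def z_history(u, interval, gains):
    """Return the z amplitude just after each presentation.

    Illustrative simplification: presentation k adds gains[k] to z
    instantly, and z decays as exp(-u * t) between presentations,
    which are spaced `interval` time units apart.
    """
    z = 0.0
    history = []
    for k, gain in enumerate(gains):
        if k > 0:
            z *= math.exp(-u * interval)  # forgetting between presentations
        z += gain                          # learning during a presentation
        history.append(z)
    return history

# Ten equal presentations, one time unit apart.
slow = z_history(u=0.01, interval=1.0, gains=[1.0] * 10)  # 1/u >> interval
fast = z_history(u=5.0, interval=1.0, gains=[1.0] * 10)   # 1/u << interval
print(round(slow[-1], 3), round(fast[-1], 3))
```

With the small u, the amplitude keeps accumulating, so a single mistaken
increment is a small fraction of the total but the total is not bounded
in practice. With the large u, the amplitude stays close to one
increment, so the most recent presentation, right or wrong, dominates.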

Because of condition (c) above, a practical simple outstar
requires a large u. Thus design improvements to the fast forgetting
outstar which result in greater resistance to random mistakes are
desirable. In the next several chapters we shall introduce more
complicated outstars which exhibit improved noise resistance
without the x process amplitude problems of the simple outstar.
However, for the present, we still have an avenue open for increasing
the simple outstar's noise resistance.

Part of the reason for the poor noise resistance of the simple
outstar in experiment II was the fact that v was selected
by the "one presentation means well learning" criterion. Thus
presentation of a random mistake once resulted in its being well learned.
Had we selected a smaller v and required more presentations of the
pattern in rapid succession to result in well learning, then the effect
of the random mistake would be smaller. At the same time the
correctability of previously learned mistakes would decrease. If we
wish to make the noise resistance of the outstar very good by this
method, then we must be content with an outstar of slow intelligence
that requires having a pattern dinned into its head before it learns
it; or, we could use the pumping up phenomenon of the outstar and have
it think about a pattern presented to it many times in rapid succession
before it is well learned. Selection of the proper v to be used in an
outstar is a design decision which must take into account this trade
off.

section 3.5 The Occurrence of a Pattern of Events over a Period
of Time; Thresholds

In experiments I and II, the grid node events in a pattern were
always presented exactly $\tau$ time units after the command event.
The reason for this is that it takes $\tau$ time units for the x response
to the command input pulse to travel along the directed edges to the
arrowheads impinging on the grid nodes. Until the prediction signal
$x_c(t - \tau)$ arrives at the arrowheads there can be no correlation
between $x_c(t - \tau)$ and the x process response to a grid input pulse
at the adjacent grid node. Thus no learning can occur until the
prediction signal begins to arrive at the arrowheads. However, we
have seen indications that learning does occur with grid events
presented at times other than $\tau$ time units after presentation of the
command event. In this section we shall examine this phenomenon,
but first we must develop a notion that will make discussion of this
phenomenon easier. If we are going to study how well an outstar learns
associations between the command event and grid events which may occur
more than or less than $\tau$ time units after presentation of the command
event, we will need a method of describing when these events occur.
Measuring the occurrence of grid events relative to the occurrence of
the command event is not a very good idea. No learning can occur
until the prediction signal has arrived at the arrowheads. The
transmission time delay $\tau$ is a rather arbitrary time interval which
may be changed from outstar to outstar.

However, once the prediction signal begins to arrive at the
arrowheads, the outstar will begin to learn the pattern on the grid nodes
independent of how long it took the prediction signal to travel from
the command node. Thus a good reference point for describing the
occurrence of grid events is the time instant when the prediction signal
begins to arrive at the arrowheads. We shall denote this instant in
time as $\phi = 0$ and let $\phi$ be the time measured relative to
$\phi = 0$ at which grid events are presented. Grid events which occur
before $\phi = 0$ will be said to occur at negative values of $\phi$, and
grid events which occur after $\phi = 0$ will be said to occur at positive
values of $\phi$. $\phi$ shall be called the phase of an event with respect
to the prediction signal, or simply the presentation phase. To be
precise, $\phi$ will be defined as follows. Let $t_p$ be the time instant at
which the prediction signal begins to arrive at the arrowheads. Let
$t_g$ be the time instant at which a grid node input pulse begins to be
non zero. Then $\phi$ is:

$$\phi = t_g - t_p$$

Figure 3.5.1. The upper traces show the input pulse used, the
resulting prediction signal, the response of a grid node to an
event of $\phi = 0$ presentation phase, and the response of the $z_{ci}(t)$
process associated with that node. The bottom curve shows the
phase-correlation curve and the irreducible phase-correlation curve.

The following experiment was performed. A practical simple outstar
with a fast forgetting rate and many grid nodes was set up. The constant
v was selected to result in well learning in one presentation of a
grid event with $\phi = 0$ presentation phase. Then each of the grid
nodes was excited with events presented with various presentation
phases. The z processes were all given zero initial conditions. The
maximum amplitude of the z processes attained during the experiment
was plotted against the presentation phase. Lacking any better
name for a curve showing the variation of z process amplitudes with
the presentation phase, the curve shall arbitrarily be called a
"phase-correlation" curve. A phase-correlation curve is shown at the
bottom of figure 3.5.1.

Figure 3.5.1 shows a variety of things besides a "phase-correlation"
curve. The top trace in figure 3.5.1 shows the shape
and dimensions of the input pulse used in the experiment. The
$x_c(t - \tau)$ trace shows what the prediction signal looked like as it
arrived at the arrowheads. The first response of the $x_i(t)$ trace shows
what the x process response looked like for a grid node excited by
an event presented with $\phi = 0$ presentation phase. The second
response on the $x_i(t)$ trace shows what a prediction response for this
grid node looks like. The $z_{ci}(t)$ trace shows what the $z_{ci}(t)$ process
in the arrowhead impinging on the above grid node looked like. The
irreducible phase-correlation curve shown is related to the
phase-correlation curve and will be explained shortly.

The additional information shown in figure 3.5.1 is provided as
a pictorial look at the various processes going on in an outstar.
This information was gathered from a number of experiments and will
be compared to the results of the next section in which we study the
effects of using other input pulses in an outstar. Thus the actual
numerical values for the amplitudes of the processes shown are
somewhat meaningless. To allow comparisons to be made, the data in figure
3.5.1 was plotted as functions of various network parameters.

In the preceding experiments we have followed the convention,
in assigning values to the x process rise rate $\alpha$ and the input pulse
duration $\delta$, of setting $\delta = 1/\alpha$. The time interval
$\delta = 1/\alpha$ describes two important time intervals in the network:
the input pulse duration, and the rise time of the x processes. Since
this study is limited to input pulses of duration $\delta$ and since we have
assigned $\alpha$ such that $1/\alpha = \delta$ throughout, a natural selection
for a time unit among the experimental parameters is $\delta = 1/\alpha$.
The time axes in figure 3.5.1 are thus in terms of $\delta = 1/\alpha$.
Since the time constant associated with the z processes is the decay
time 1/u, this period is shown on the z traces.

The analytical solution for an x process responding to a
rectangular input pulse presented at time $t = t_0$ of amplitude A and
duration $\delta$ is:

$$x(t) = \begin{cases}
(A/\alpha)\left(1 - e^{-\alpha (t - t_0)}\right) & \text{for } t_0 \le t \le t_0 + \delta \\
(A/\alpha)\left(1 - e^{-1}\right) e^{-\alpha (t - t_0 - \delta)} & \text{for } t_0 + \delta \le t
\end{cases}$$

The notation $[y]^+$ is defined by:

$$[y]^+ = \begin{cases} y & \text{for } y > 0 \\ 0 & \text{for } y \le 0 \end{cases}$$

Note that this solution is valid independent of the numerical values
assigned to A, $\alpha$, and $\delta$ as long as $\delta = 1/\alpha$.

Thus the amplitudes for the x processes are always proportional
to $A/\alpha$, and this combination of experimental parameters was used as
the amplitude axes for the x processes shown in figure 3.5.1.
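
The response of an x process to a rectangular pulse can be evaluated
directly. The sketch below is illustrative only: it assumes the x process
obeys dx/dt = -alpha x + P(t) with x = 0 before the pulse, and the
function name `x_response` and its default values are hypothetical.

```python
import math

def x_response(t, t0=0.0, amplitude=1.0, alpha=1.0, delta=None):
    """x process response to a rectangular pulse of height `amplitude`.

    Assumes dx/dt = -alpha * x + P(t) with x = 0 before the pulse.
    `delta` (pulse duration) defaults to 1/alpha, the convention in the text.
    """
    if delta is None:
        delta = 1.0 / alpha
    scale = amplitude / alpha
    if t < t0:
        return 0.0
    if t <= t0 + delta:
        # Rising portion while the pulse is present.
        return scale * (1.0 - math.exp(-alpha * (t - t0)))
    # Exponentially decaying "tail" after the pulse ends.
    peak = scale * (1.0 - math.exp(-alpha * delta))
    return peak * math.exp(-alpha * (t - t0 - delta))

# The peak occurs at the end of the pulse and scales with amplitude / alpha.
peak = x_response(1.0)
print(peak, x_response(1.0, amplitude=2.0) / peak)
```

Plotting `x_response(t) / (amplitude / alpha)` against `t / delta` gives the
same curve for any choice of amplitude and alpha when delta = 1/alpha,
which is the reason $A/\alpha$ serves as an amplitude scale.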

The equation for the z processes is nonlinear and an analytical
solution was not found in this study. A combination of experimental
parameters was sought to scale the amplitude axes for the z process
traces and the phase-correlation curves. It was desired that a plot
of a z process against this scale factor would be the same for all
experiments even though the numerical values of the parameters in the
experiments were different. At the beginning of the experimental
study, the parameter combination $v(A/\alpha)^2\delta$ seemed to work well
and was therefore adopted. However, later experiments showed that this
scale factor did not work well. Nevertheless, it was retained to allow
comparisons. With this explanation of the scales for the axes of the
plots in figure 3.5.1, we may proceed with a discussion of the
phase-correlation curve.

The phase-correlation curve in figure 3.5.1 shows the maximum
increase in amplitude of a z process due to the correlation between the
prediction signal and a grid node x process excited by an event
presented with presentation phase $\phi$. As can be seen, the maximum
increase in amplitude for a z process occurs when a grid node is
excited by an event with $\phi = 0$ presentation phase. Events presented
with $\phi \neq 0$, indicating that they were presented before or after the
arrival of the prediction signal at the arrowheads, result in a lesser
increase in z process amplitude. For $|\phi| > 3\delta = 3/\alpha$, there is
no appreciable increase in z process amplitude.

The effect of the phenomenon revealed by the phase-correlation
curve may be interpreted in a number of ways. Suppose that a command
event is presented to the outstar at time $t_c$. Suppose further that a
collection of grid events, 1, 2, ..., M, usually accompany the occurrence
of the command event in the environment. However, suppose that these
grid events do not all occur at the same time. Let each one occur
at time $t_1, t_2, \ldots, t_M$. The prediction signal generated by the
command event will arrive at the arrowheads at time $t_c + \tau$. The
phase-correlation curve tells us that the outstar will learn to some
extent that all the grid events which occur at times $t_i$ such that:

$$|(t_c + \tau) - t_i| < 3\delta = 3/\alpha$$

are associated with the command event. Note that $t_i - (t_c + \tau)$ is
the presentation phase $\phi$ for the ith event, so this condition is
$|\phi| < 3\delta$. The phase-correlation curve tells us further that those
events which occur at times $t_j$ such that:

$$|(t_c + \tau) - t_j| < 0.5\,\delta = 1/2\alpha$$

will be learned to be associated with the command event very well.
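
The two windows read off the phase-correlation curve can be expressed as
a small classifier. This is a sketch under stated assumptions: the
limits of 3 and 0.5 pulse durations are the ones quoted above for a
rectangular pulse without thresholds, and the function names and the
three labels are hypothetical.

```python
def presentation_phase(t_event, t_command, tau):
    """Phase of a grid event relative to arrival of the prediction signal.

    The prediction signal arrives at t_command + tau; events before that
    instant have negative phase, events after it have positive phase.
    """
    return t_event - (t_command + tau)

def learning_strength(phi, delta):
    """Classify an event using the windows quoted for the phase-correlation curve."""
    if abs(phi) < 0.5 * delta:
        return "well learned"
    if abs(phi) < 3.0 * delta:
        return "partly learned"
    return "not learned"

delta, tau, t_command = 1.0, 4.0, 10.0
for t_event in (13.8, 15.5, 19.0):
    phi = presentation_phase(t_event, t_command, tau)
    print(t_event, round(phi, 2), learning_strength(phi, delta))
```

The sharp boundaries are a simplification; the measured curve falls off
gradually between the two limits.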

One interpretation of this information is that we now have a
means by which we can intelligently specify $\tau$ in an outstar. We have
said nothing about when a command event occurs relative to a pattern
of grid events. In everyday experience we are confronted with
situations in which the occurrence of a "command" event results in the
occurrence of a "pattern" of events. The time delay between occurrence
of the command event and the pattern differs from case to case: the
command event of switching an electric light switch resulted
almost immediately in the pattern of the electric lights in a room
going on. We also learned that the command event of putting a seed
in the ground resulted days later in the "pattern" of a plant sprouting.
In designing an outstar functioning in a "real" environment,
specification of $\tau$ should be made according to the average time delay
between occurrence of command events and the associated patterns that the
outstar is capable of learning. The phase-correlation curve tells us what
the standard deviation of this time delay can be and still result in the
outstar being able to learn.

On the other hand, the phenomenon shown by the phase-correlation
curve is a source of errors in an outstar avalanche. Suppose that
the command nodes in an avalanche command node cascade are so arranged
that the time interval between excitement of the $V_{cj}$ command node and
the $V_{c,j+1}$ command node is $\tau_c$. This means that the avalanche takes
"pictures" of the time varying pattern of grid events every $\tau_c$ time
units to make a sampled data approximation of the pattern. From the
phase-correlation curve of figure 3.5.1 we can see that if $\tau_c$ is less
than $3\delta = 3/\alpha$, the pictures taken by the outstars in the avalanche
will overlap one another. That is, the $V_{c,j+1}$ outstar will learn to
some extent the same pattern of events that the $V_{cj}$ outstar learns.
In particular, suppose that the pattern of events is varying rapidly
enough that the pattern of grid events at time $t + \delta$ is significantly
different from that at time t. To get an accurate sampled data
approximation in this situation, the avalanche would have to take
a "picture" every $\delta$ time units and we would set $\tau_c = \delta$. However,
the phase-correlation curve shows us that in this case the $V_{c,j+1}$
outstar will learn not only the pattern of events on the grid when
its prediction signal arrives at the arrowheads, but also the pattern
of events that was on the grid when the prediction signal from the
$V_{cj}$ outstar arrived at the arrowheads. In this situation, the
avalanche's sampled data approximation will be seriously in error.

The phenomenon shown by the phase-correlation curve in figure
3.5.1 is due to two things. First, the input pulses used in the
experiment were rectangular and of duration $\delta$. Suppose that the
equation for the x processes was such that the x processes exactly
reproduced the input pulse. That is:

$$x(t) = P(t)$$

Then the prediction signal and the x processes' responses would be
rectangular in shape and of duration $\delta$. The z process correlates
the prediction signal with the grid node x process. Thus we could
expect the z process amplitude increase due to a correlation to be
proportional to the correlation between the rectangular prediction
signal and the rectangular grid node x process. If the grid node is
excited by an event which occurs with presentation phase $\phi$ with
respect to the arrival of the prediction signal, we get:

$$z(t) \propto \begin{cases}
[\delta - \phi]^+ & \text{for } \phi > 0 \\
[\delta + \phi]^+ & \text{for } \phi < 0
\end{cases}$$

This is just the correlation between two rectangular pulses of
duration $\delta$ whose leading edges are separated in time by $\phi$. This
function is shown in figure 3.5.1 as the "irreducible phase-correlation"
curve. This curve is called irreducible because it shows what the
phase-correlation curve would look like if the x processes exactly
reproduced the input pulse.

As we have seen, the x processes do not exactly reproduce the
input pulses. This is because embedding field network nodes are low
pass filters. We have seen, and our analytical solution shows, that
the x processes' response decays exponentially away from the maximum
value it obtained during the presentation of the input pulse. This
exponentially decaying portion of an x process response will be called
a "tail". These tails account for the difference between the
irreducible phase-correlation curve and the phase-correlation curve.
Because of the tails, events presented with presentation phase such
that $\phi \ll 0$ still have non zero amplitudes to correlate with the
prediction signal when it arrives at the arrowheads. Prediction signals
also have tails which correlate with grid node x process responses to
events presented with presentation phase $\phi \gg 0$. As can be seen,
this effect begins to become important for events presented with
presentation phase $|\phi| < 3\delta$.

In an avalanche with a fast sampling rate, modifications of the
component outstars that result in a phase-correlation curve which more
closely resembles the irreducible phase-correlation curve are
important. One modification would be to increase the x process rise rate
$\alpha$. Making $\alpha$ very large will result in x process responses that
will very closely follow the shape of the input pulses. Thus the
phase-correlation curve should be very close to the irreducible
phase-correlation curve.

However, increasing $\alpha$ is not always possible. In this study,
increasing $\alpha$ resulted in either intolerable errors or extremely
lengthy computer runs to perform an experiment. Appendix A explains
the error-computation time trade off in selection of $\alpha$ for the
digital simulations of this study.

If $\alpha$ can not be increased enough to make the phase-correlation
curve sufficiently close to the irreducible phase-correlation curve,
there are other methods which will accomplish this. Grossberg has
proposed the use of thresholds. The equations for a simple outstar
with thresholds are:

3.5.1   $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

where $\Gamma_c$ is the command node threshold. As can be seen, it prevents
the prediction signal $x_c(t - \tau)$ from exciting a grid x process
until $x_c(t - \tau) > \Gamma_c$. Additionally, it prevents the prediction
signal from being correlated with the grid nodes' x processes until
$x_c(t - \tau)$ is suprathreshold. The grid node threshold $\Gamma_x$ performs
the same function. In effect, these thresholds will cut off the "tails" of
the x processes and thus should result in a phase-correlation curve
which closely resembles the irreducible phase-correlation curve.

Figure 3.5.2. Illustration of the effects of thresholds on a simple
outstar. Equivalent thresholds are placed on both the command node and
the grid nodes. Note how close the phase-correlation curve is to the
irreducible phase-correlation curve. (Horizontal axis: presentation
phase, in terms of $\delta$.)
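
One form of the threshold equations that is consistent with the
description above can be sketched as follows. This is an assumed
reconstruction, not a quotation from this study: it takes the simple
outstar equations, replaces the prediction signal by its suprathreshold
part, and replaces the grid node signal in the correlation term by its
suprathreshold part. Here $\Gamma_c$ and $\Gamma_x$ stand for the command node
and grid node thresholds, and $\beta$ for the weight of the prediction
signal on the grid nodes.

```latex
\begin{aligned}
\dot{x}_c(t)    &= -\alpha\, x_c(t) + P_c(t) \\
\dot{x}_i(t)    &= -\alpha\, x_i(t) + P_i(t)
                   + \beta\,[x_c(t-\tau) - \Gamma_c]^+\, z_{ci}(t) \\
\dot{z}_{ci}(t) &= -u\, z_{ci}(t)
                   + v\,[x_c(t-\tau) - \Gamma_c]^+\,[x_i(t) - \Gamma_x]^+
\end{aligned}
```

With $\Gamma_c = \Gamma_x = 0$ and non negative x processes, these expressions
reduce to the equations of the simple outstar without thresholds.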

Figure 3.5.2 shows the results of an experiment conducted with
an outstar with thresholds. The command node threshold $\Gamma_c$ used in
this experiment was selected to make the time interval during which
the prediction signal is suprathreshold approximately $\delta$ time units
in duration, as can be seen from the $x_c(t)$ trace. The grid node
threshold $\Gamma_x$ was selected to be the same: $\Gamma_x = \Gamma_c$. As can be
seen, the phase-correlation curve very closely approximates the
irreducible phase-correlation curve. Using thresholds, we could make an
avalanche which could accurately sample a time varying pattern every
$2\delta$ time units. Without thresholds, the shortest the accurate sampling
interval could be is about $6\delta$ time units, as shown in figure 3.5.1.
Thus the addition of thresholds has increased the accurate sampling rate
for an avalanche by a factor of three.
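
The choice of a threshold that keeps a signal suprathreshold for about
one pulse duration can be explored with the analytic response to a
rectangular pulse. The sketch below is an illustration under
assumptions, not the procedure used in this study: it treats a single
unthresholded x response obeying dx/dt = -alpha x + P(t), and the
function names are hypothetical.

```python
import math

def suprathreshold_interval(gamma, amplitude=1.0, alpha=1.0, delta=None):
    """Length of time an x response to a rectangular pulse stays above gamma.

    Uses the analytic response to a pulse of height `amplitude` and
    duration `delta` (default 1/alpha). Returns 0.0 if the peak never
    exceeds gamma.
    """
    if delta is None:
        delta = 1.0 / alpha
    scale = amplitude / alpha
    peak = scale * (1.0 - math.exp(-alpha * delta))
    if gamma <= 0.0:
        return math.inf  # the exponential tail never reaches zero exactly
    if gamma >= peak:
        return 0.0
    rise_cross = -math.log(1.0 - gamma / scale) / alpha  # measured from pulse onset
    fall_cross = delta + math.log(peak / gamma) / alpha  # measured from pulse onset
    return fall_cross - rise_cross

def threshold_for_interval_delta(amplitude=1.0, alpha=1.0):
    """Threshold for which the suprathreshold interval equals delta = 1/alpha."""
    scale = amplitude / alpha
    peak = scale * (1.0 - math.exp(-1.0))
    return peak / (1.0 + peak / scale)

gamma = threshold_for_interval_delta()
print(gamma, suprathreshold_interval(gamma))
```

Raising the threshold shortens the interval and lowering it lengthens
the interval, so the threshold acts as a control on how much of the
"tail" takes part in the correlation.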

However, this possible increase in the accurate sampling rate for
an avalanche has not been obtained without a cost. The $x_2(t)$ traces
and the $z_{c2}(t)$ traces in figure 3.5.2 are for a grid node excited by
an event with presentation phase $\phi = 0.5\,\delta$. Looking closely at the
$x_2(t)$ trace, one can see that in the first response, $P_2(t)$ drove
$x_2(t)$ above threshold and thus $z_{c2}(t)$ grew. However, on the second
response, the excitement of $x_2(t)$ was insufficient to drive it
suprathreshold and thus $z_{c2}(t)$ continued its exponential decay.
Lacking the ability to drive $x_2(t)$ suprathreshold, the outstar can
not "pump up" the $z_{c2}(t)$ process and we must conclude that the memory
is bound for extinction. In the same way, if the $z_{c1}(t)$ is
allowed to decay further, prediction excitement of $x_1(t)$ will also
be unable to drive $x_1(t)$ suprathreshold and all memory of the pattern
would be bound for extinction. In the simple outstar without thresholds,
we saw that no matter how much the z processes decayed, we could still
recover the information stored in them by "pumping up". Thus, although
a memory could fade due to forgetting, it could not be absolutely
forgotten. An outstar with grid node thresholds can absolutely forget
a pattern it has learned.

Figure 3.5.3. Effect on the phase-correlation curve of a
threshold placed on the command node only. (Vertical axis: increase in
z process amplitude. Horizontal axis: presentation phase $\phi$, in terms
of $\delta$. Curves shown: phase-correlation curve and irreducible
phase-correlation curve.)

To prevent a memory from being absolutely forgotten, we must set
$\Gamma_x = 0$. This was done and a series of experiments was performed
to determine the phase-correlation curve. Figure 3.5.3 shows the results.
The only x process "tail" that was cut off by a threshold was the
prediction signal's. Thus, the phase-correlation curve for $\phi > 0$ is
very close to the irreducible phase-correlation curve. This is because
events with presentation phase $\phi > 0$ occur after the prediction signal
has arrived at the arrowheads. Cutting the prediction signal's tail
off prevents it from correlating with x process responses to events
presented with presentation phases greater than the time interval
during which the prediction signal is suprathreshold. In this case,
this meant no correlation with x processes responding to events
presented with presentation phase $\phi > \delta$. On the other hand, the x
processes retained their "tails" because $\Gamma_x = 0$. Thus the "tails"
of grid node responses to events occurring before the prediction signal
arrived at the arrowheads ($\phi < 0$) were available for correlation.
This explains why the phase-correlation curve for $\phi < 0$ in figure
3.5.3 is similar to the phase-correlation curve for an outstar without
thresholds.

In addition to making the phase-correlation curve for an outstar
closer to the irreducible phase-correlation curve, thresholds may be
used for an interpretive purpose. Since the prediction signal
$[x_c(t - \tau) - \Gamma_c]^+$ could not affect the grid nodes until
$x_c(t - \tau)$ was suprathreshold, we could follow the convention of
saying that an x process at a node does not indicate a response by that
node until it is suprathreshold. We could still set $\Gamma_x = 0$ in
equation 3.5.3 and place an imaginary threshold on the grid nodes. With
this interpretive convention, we have a concrete relationship between the
amplitudes of the x processes and the psychological idea of a response
from a subject. Additionally, the phenomenon of a faded memory
popping up into the outstar's consciousness during "pumping up"
is given a concrete interpretation.



section 3.6 Other Input Pulse Shapes



A short study was made of the effects on a simple outstar of using
input pulses with shapes other than rectangular. The results were
that there appear to be no qualitative differences in the performance
of an outstar using any input pulse of duration less than or equal to
$1/\alpha$. The sole exception to this qualitative finding was that the
choice of input pulses does affect the shape of the phase-correlation
curves.

Quantitatively, the input pulse did affect the maximum amplitude
of the x responses. Additionally, the magnitude of v needed to meet a
specific well learning criterion was affected.

One important result of this study was that the maximum amplitude
of a grid node x process responding to a prediction signal alone was
at approximately $1/\alpha$ time units after arrival of the prediction
signal for all input pulses. If we consider the input apparatus of the
outstar to be a data sampler which samples the environment at time $t_p$
and delivers appropriate input pulses to the outstar's nodes, then this
effect can be considered to be an inherent time delay in the outstar's
prediction. That is, an event which occurs in the environment at time
$t_p$ is predicted by the outstar to occur at time $t_p + 1/\alpha$.

Figures 3.6.1, 3.6.2, and 3.6.3 show the results for the pulses
used in this study. They should be compared to figure 3.5.1, which shows
similar results for a rectangular pulse. The irreducible phase-correlation
curves in these figures were computed by analytically correlating the
input pulses.

Figure 3.6.1. The response of an outstar to a triangular input pulse.

Figure 3.6.2. The response of an outstar to an exponential input pulse.

Figure 3.6.3. The response of an outstar to an impulse input pulse.

CHAPTER 4

LATERAL INHIBITION

section 4.1 Introduction to Lateral Inhibition



The last chapter showed that a practical outstar (one with a fast
forgetting rate) had the major drawback of either being a slow
learner or having very low resistance to random mistakes. This was
due to its inability to additively sum its past experience in the z
processes because of the large decay rate. In this chapter we will
study a more complicated outstar which retains all the desirable
qualities of the simple outstar with a fast forgetting rate and has
the further property that it is resistant to random mistakes.

The additive summing of past experience in the slowly forgetting
outstar of chapter 3 resulted in good resistance to random mistakes
because this outstar's experience with the correct pattern was so great
that it could absorb mistakes. The opposite of this passive absorption
of mistakes would be to use the past experience to actively suppress a
mistake when it occurs. The psychological term for active suppression
is inhibition. Figure 4.1.1 shows the geometric schematic and the
equations for a laterally inhibiting outstar. The equations governing
its performance are here repeated for convenience:

4.1.1   $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

4.1.2   $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta\, x_c(t - \tau)\, z_{ci}(t) - \sum_{j=1,\, j \neq i}^{N} [x_j(t - \tau)]^+$

4.1.3   $\dot{z}_{ci}(t) = -u\, z_{ci}(t) + v\,[x_c(t - \tau)\, x_i(t)]^+$

The notation $[y]^+$ means the maximum of the variable y, or 0, as
in the case with thresholds. A short discussion of the significant
differences between equations 4.1 and those for a simple outstar follows.

Figure 4.1.1. An outstar with lateral inhibition. The double
lined directed edges transmit inhibitory signals. Only three
grid nodes are shown (N = 3). (The figure also lists equations 4.1.1
through 4.1.3 under the title "Equations Governing Network Performance".)

A negative prediction signal, $-\sum_{j \neq i} [x_j(t - \tau)]^+$, has been
added to the equation for the grid nodes' x processes. This is the net
inhibitory signal sent to grid node i from all the other nodes in the
grid. These inhibitory signals are sent along the double lined
directed edges in figure 4.1.1. The transmission delay from the
originating node to the receiving node is $\tau$. Note that a grid node
sends an inhibitory signal only if its x process is positive. With
inhibition, it is possible for an x process to have negative amplitudes.
We shall adhere to the convention of considering that a node is
responding only if its x process is positive. Although we will be
able to measure the negative excursions of the x processes, they shall
be considered equivalent to zero amplitudes in the simple outstar.

In the simple outstar, zero or small amplitudes were interpreted
as no response. In the laterally inhibiting outstar, negative
amplitudes mean that node is in an inhibited state. Using the above
convention for interpreting the response of a node implies that an
inhibited node is in a super non responding state. Limiting a node's
ability to affect other nodes via the inhibitory signals to only
those times when its x process is positive is consistent with the above
convention.

No learning occurs in the arrowheads of the inhibitory directed
edges. The z process in those arrowheads can be considered to always
have a value of unity.

Equation 4.1.3 for the z processes located in the arrowheads
of the directed edges from the command node is the same as that for
a simple outstar. Again, a node's inhibited state is ignored by
the correlation driving function $v\,[x_c(t - \tau)\, x_i(t)]^+$. Thus the
z processes can only have non negative values. For this reason this
outstar is an excitatory biased machine. We will have occasion in a
later chapter to investigate outstars which allow negative z processes
and are more neutrally biased.

The rationale for lateral inhibition is to have a responding
grid node inhibit all the other grid nodes. When several grid nodes
are responding at the same time, we expect the node responding with
the greatest amplitude to inhibit the other nodes the most while
suffering the least inhibition itself. When a random mistake occurs
in a previously learned pattern, the prediction signal inputs to the
grid nodes will cause the nodes corresponding to events in the pattern
to respond with greater amplitude than the nodes corresponding to the
mistake. This will result in inhibition of the response to the mistake.
section 4.2



Experimental Study of an Outstar with Lateral
Inhibition



To test the claim that the laterally inhibiting outstar has good
noise resistance, we shall repeat experiment II which was performed
with the simple outstar. All the parameter specifications for that
experiment will be retained. However, we have two new parameters to
specify, $\beta'$ and $\tau'$.

If it takes too long for inhibitory signals to travel along their
directed edges, then we shall have defeated the purpose of lateral
inhibition by having inhibiting signals arrive after the damage has
been done. Thus $\tau'$ should be small. Lateral inhibition would be
most effective if $\tau' = 0$, but we shall observe the constraints on
transmissions along directed edges set up in chapter 1. With these
arguments in mind, $\tau'$ is selected to be:

    $\tau' = \frac{1}{3\alpha} = \frac{1}{3}\tau = \frac{1}{3}\delta$



A rational guess for $\beta'$ is difficult. In order to specify it most
efficiently we would need some idea of the average number of grid events
composing a pattern and the average number of events that compose
a random mistake. The reason for desiring this information when
selecting $\beta'$ is obvious. Suppose that we had two patterns we wished
to teach to two outstars sharing the same grid. Pattern $\theta_1$ is composed
of one event. Pattern $\theta_2$ is composed of n events where 1 < n < N
and N is the number of grid nodes. Then the node corresponding to the
event in pattern $\theta_1$ will not be inhibited at all. However, each of
the nodes corresponding to events in $\theta_2$ will inhibit each other and
the node responses to $\theta_2$ will have a diminished amplitude. Thus the
z correlations for $\theta_2$ will be smaller and it will require many more
instructions to learn $\theta_2$ than it will require to learn $\theta_1$. Any
selection for $\beta'$ will work well in learning $\theta_1$. On the other hand,
an excessively large $\beta'$ will result in very inefficient learning of
$\theta_2$.

However, we want $\beta'$ large enough to inhibit random mistakes.
Thus we are faced with a trade off between inefficient learning and
the proper degree of inhibition to counter mistakes. A foreknowledge
of the average situation to expect would greatly aid in the proper
selection of $\beta'$. Of course, if we wanted our outstars to be
completely unbiased at the beginning of the experiment, we could make
a large number of them with various $\beta'$ and turn them loose in the
environment. Survival of the fittest would soon select the optimal
$\beta'$.

For the purposes of this study, it was decided to select $\beta'$ on the
idea that at most two events would compose a pattern and a random
mistake on the average would consist of one event. $\beta'$ was chosen to
allow the inhibitory signal from excited nodes to drive an unexcited
node to approximately one-half the amplitude of the excited node.
A brief analysis was made to meet this criterion as follows:

Maximum amplitude of an x process excited by a rectangular pulse
of amplitude A and duration $\delta = 1/\alpha$ was

    $\max(x_i(t)) = (A/\alpha)(1 - e^{-1}) = 0.63 A/\alpha$

Amplitude of such an input resulting in $(1/2)\max(x_i(t))$ is $(1/2)A$:

    $\beta' (0.63)(A/\alpha) = (1/2) A$

    or $\beta' = \alpha/1.26$

    for $\alpha = 3.333$, $\beta' \approx 2.64$
An experimental check of this resulted in $\beta' = 2.38$. The 11%
error is due both to the naivete of the analysis and errors inherent
to the digital simulation.
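
The arithmetic of this estimate can be checked in a few lines. The
following is only a sketch of the naive analysis above; the function name
inhibition_gain and its fraction argument are our own labels and do not
appear in the original study.

```python
import math


def inhibition_gain(alpha: float, fraction: float = 0.5) -> float:
    """Naive estimate of beta' (hypothetical helper, not from the thesis).

    An x process driven by a rectangular pulse of amplitude A and duration
    1/alpha peaks at (A/alpha) * (1 - e^-1). We ask that beta' times this
    peak act like an input of amplitude fraction * A; A cancels out.
    """
    peak_per_unit_input = (1.0 - math.exp(-1.0)) / alpha
    return fraction / peak_per_unit_input


beta_prime = inhibition_gain(alpha=3.3333)
# Relative difference between the estimate and the experimental value 2.38.
relative_error = (beta_prime - 2.38) / 2.38
```

With fraction = 0.5 this reproduces $\beta' = \alpha/1.26$, and the relative
difference from the experimental value comes out at roughly 11%.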

Inadvertently, v was changed to 2.4 resulting in a well learning
in one and one-half presentations criterion. This minor discrepancy
is not sufficient to prevent comparison with experiment II. For
convenience the major parameters used are listed here:

Network parameters:

    $\alpha = 3.3333$ sec$^{-1}$ $= 1/\delta$

    $\beta = 1.0$

    $u = 0.558$ sec$^{-1}$

    $v = 2.4$

    $\tau = 0.3$ sec

    $\tau' = 0.1$ sec

    $\beta' = 2.38$

Input pulse parameters:

    $A = 10$

    $\delta = 0.3$ sec $= 1/\alpha$

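The equations governing the performance of the laterally inhibiting
outstar are:

    $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

    $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta x_c(t-\tau) z_{ci}(t) - \beta' \sum_{j \ne i} [x_j(t-\tau')]^+$

    $\dot{z}_{ci}(t) = -u z_{ci}(t) + v [x_c(t-\tau) x_i(t)]^+$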
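
For readers who wish to experiment with these equations, a minimal
forward-Euler sketch is given below. It is not the simulation program used
in this study. The integration step dt, the zero initial conditions, the
event schedule, and all function and variable names are assumptions made
for illustration; only the parameter values are taken from the list above.

```python
def simulate_outstar(events, t_end, alpha=10.0 / 3.0, beta=1.0, u=0.558,
                     v=2.4, tau=0.3, tau_i=0.1, beta_i=2.38, amp=10.0,
                     n_grid=3, dt=0.001):
    """Forward-Euler sketch of equations 4.1.1 to 4.1.3 (assumed scheme).

    events: list of (node, start_time); node "c" is the command node and
    integers 0..n_grid-1 are grid nodes. Each pulse has amplitude `amp`
    and duration 1/alpha. tau_i and beta_i stand for tau' and beta'.
    Returns the traces (x_c, x, z) with z[i] playing the role of z_ci.
    """
    steps = int(round(t_end / dt))
    lag = int(round(tau / dt))        # delay of the prediction signal
    lag_i = int(round(tau_i / dt))    # delay of the inhibitory signals
    width = 1.0 / alpha

    def pos(y):                       # the [y]^+ operator
        return y if y > 0.0 else 0.0

    def pulse(node, t):               # rectangular input pulse P(t)
        on = any(n == node and s <= t < s + width for n, s in events)
        return amp if on else 0.0

    xc = [0.0] * (steps + 1)
    x = [[0.0] * (steps + 1) for _ in range(n_grid)]
    z = [[0.0] * (steps + 1) for _ in range(n_grid)]
    for k in range(steps):
        t = k * dt
        xc_del = xc[k - lag] if k >= lag else 0.0          # x_c(t - tau)
        xc[k + 1] = xc[k] + dt * (-alpha * xc[k] + pulse("c", t))
        for i in range(n_grid):
            if k >= lag_i:
                inhib = sum(pos(x[j][k - lag_i])
                            for j in range(n_grid) if j != i)
            else:
                inhib = 0.0
            dx = (-alpha * x[i][k] + pulse(i, t)
                  + beta * xc_del * z[i][k] - beta_i * inhib)
            dz = -u * z[i][k] + v * pos(xc_del * x[i][k])
            x[i][k + 1] = x[i][k] + dt * dx
            z[i][k + 1] = z[i][k] + dt * dz
    return xc, x, z


# Hypothetical schedule: two presentations of one grid event (index 1),
# each tau seconds after a command pulse.
events = [("c", 0.0), (1, 0.3), ("c", 2.0), (1, 2.3)]
xc, x, z = simulate_outstar(events, t_end=4.0)
```

In this sketch the z process of the taught node grows, while the other grid
nodes are driven negative by inhibition and their z processes remain at
zero, as the $[y]^+$ operator in equation 4.1.3 requires.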




Experiment III was begun by teaching the outstar the pattern
$V_c \to V_2$ by two presentations of event 2, $\tau$ time units after
presentation of the command event. Figure 4.2.1 shows the result. The
prediction response on the $x_2(t)$ trace shows that $V_c \to V_2$ was well
learned as is indicated by the $z_{c2}(t)$ trace. Figure 4.2.1 should be
compared with figure 3.3.1 which shows the result of the same pattern
being taught to a practical simple outstar.

Of interest in figure 4.2.1 is the fact that a minor association
of $V_c$ with $V_3$ did not occur to any significant extent. Event 1 was
presented with presentation phase $\varphi = 0$ with respect to arrival of
the prediction signal at the arrowheads. That is, event 1 was
presented at the exact time instant that the prediction signal arrived
at the arrowheads. In the discussion of the phase-correlation curves
of section 3.5, we saw that presenting an event with presentation
phase $\varphi = 0$ results in the greatest increase in the amplitudes of
the z process. In this sense, an event presented with presentation
phase $\varphi = 0$ is learned best. Events presented with presentation phase
$\varphi \ne 0$ are learned to a lesser extent. In figure 4.2.1, event 3
was presented 0.6 seconds after event 1. That is, event 3 was
presented with presentation phase $\varphi = +0.6$ seconds $= 2\delta = 2/\alpha$. From
the phase-correlation curve for a simple outstar without thresholds
in section 3.5, we saw that presenting an event with presentation
phase $\varphi = +2\delta = +2/\alpha$ resulted in a significant increase in the
associated z process's amplitude. The addition of thresholds to the
simple outstar prevented any increase in the associated z process by
cutting off the "tails" of the x processes.

The laterally inhibiting outstar currently under study does not
have thresholds. However, from the fact that only a very minor
association $V_c \to V_3$ was learned in figure 4.2.1, it appears that lateral
inhibition has some of the same effects that thresholds have on the
performance of an outstar. We will investigate this performance in
detail in section 4.4.

Experiment III was continued to check the claim that lateral
inhibition increases resistance to random mistakes. Figure 4.2.2
shows the result of presenting the previously learned pattern $V_c \to V_2$
with a simulated random mistake, event 1. As the $x_1(t)$ and $z_{c1}(t)$
traces show, the mistake was inhibited to the point where $V_c \to V_1$
was learned to only a minor extent. (Compare to figure 3.2.3.)
Additionally, predictions following the mistake presentation resulted
in the $x_1(t)$ process being totally inhibited. This resulted in the
memory of the mistake decaying towards extinction, as shown by the
$z_{c1}(t)$ trace. Of course the dramatic results shown in figure 4.2.2
were due to the comparative freshness of the pattern $V_c \to V_2$ in the
outstar's memory as shown by the large amplitude for $z_{c2}(t)$ when the
mistake was presented. The memory of $V_c \to V_2$ will fade as $z_{c2}(t)$
decays. If the memory is sufficiently faded, we will not expect the
resistance to random mistakes to be as good.

This has parallels in everyday experience. Students are less
likely to be deceived by a tricky question in an examination when the
subject matter is fresh in their minds.

Lateral inhibition does not prevent the outstar from correcting
a previously learned pattern which is in error. Experiment III was
continued to correct the previously learned pattern $V_c \to V_2$ with a
new pattern $V_c \to V_1$. Figure 4.2.3 shows the results. As can be seen,
two presentations of the new pattern were sufficient to totally
inhibit the old pattern and insure extinction of its memory.
section 4.3



Advantage of Correcting a Learned Mistake with
Lateral Inhibition



In section 3.4 we discussed the effects of the forgetting rate
u on a simple outstar's resistance to random mistakes and on its
ability to correct learned mistakes. The conclusion was that a small u
resulted in good random mistake resistance, but very low correctability.
A large u had the opposite effect. From the outstar's point of view,
the only difference between a random mistake and a correction to a
previously learned pattern is that the random mistake occurs
infrequently with the command event whereas the correcting pattern usually
occurs with the command event. It was shown in section 3.4 that the
outstar remembered the difference between an event which infrequently
occurs with the command event and one which usually occurs with the
command event in the accumulated past experience contained in the
amplitude of its z processes. With a small u the past experience
was not forgotten rapidly and resulted in a great accumulation of
experience. It was not surprising that an infrequent variation in the
pattern had a small effect on the accumulated experience. On the other
hand, a great accumulation of past experience with a pattern makes it
very difficult to convince the outstar that the pattern was an error.
Due to the fast rate of forgetting past experience in the large u
outstar, little accumulation of experience occurred, resulting in its
random mistake resistance and correctability properties. Thus by
interpreting the amplitudes of the z processes as accumulated past
experience it seemed very reasonable to conclude that good random
mistake resistance and correctability were incompatible.



Figures 4.2.2 and 4.2.3 show that this need not be the case.
The laterally inhibiting outstar has both good resistance to
random mistakes and good correctability. Lateral inhibition was
introduced to make an outstar with a fast forgetting rate more
resistant to random mistakes. It might have been expected that this
would decrease its correctability. We shall inquire why it did not.

In the slowly forgetting simple outstar the only way a pattern
can be corrected is by brute force. The amplitude of grid node x
process responses is a linear function of the sum of the event input
pulses and prediction signal inputs:

    $\dot{x}_i(t) = -\alpha x_i(t) + \beta z_{ci}(t) x_c(t-\tau) + P_i(t)$

Thus the amplitude of a grid node x process response is greater when
there is an event input pulse than when there is only a prediction
signal input alone. Therefore the correlating signal $v x_c(t-\tau) x_i(t)$
is greater when there is an event input pulse and the z process grows
faster. In correcting a pattern in a slowly forgetting simple outstar,
we simply stop presenting the events of the erroneous pattern and
start presenting the events of the correcting pattern. As was shown
in figure 3.2.2, the additional amplitude of the grid node x processes
due to the correcting event input pulses results in the z processes
associated with the correcting pattern events growing faster than the
z process associated with the erroneous pattern. By the outstar theorem
we are assured that eventually the probabilities $X_i(t)$ and $y_i(t)$ will
go from values describing the erroneous pattern to values describing
the correcting pattern. However, we have seen that in the slowly
forgetting outstar, the x process amplitudes will have become
impractically large long before this happens.






In the rapidly forgetting outstar, we do not have the problem
of impractically large amplitude x processes. Further, in trying
to correct a previously learned pattern we are aided by the rapid
forgetting rate. In addition to the effects of the brute force
correcting process, the rapidly forgetting outstar forgets the erroneous
pattern while it is learning the correcting pattern. (Provided, of
course, the excitations of the command node are spaced far enough
apart not to result in significant pumping up of the erroneous pattern.)
Thus in addition to the active process of forcing the z process
associated with the correcting pattern to grow larger than those associated
with the erroneous pattern, there is the passive process of forgetting
the old pattern. As has been emphasized, this passive forgetting
process results in the better correctability of the rapidly
forgetting simple outstar as well as its low resistance to random
mistakes.

In the laterally inhibiting outstar, we retained the fast
forgetting rate to control grid node x process amplitudes. Thus we
have both the active brute force correcting process and the passive
forgetting process working to correct an erroneous pattern. If we
look closely at figures 4.2.2 and 4.2.3, we can see the effect of
lateral inhibition in both random mistake correction and pattern
correction. In figure 4.2.2, presentation of the previously learned
pattern $V_c \to V_2$ with the simulated random mistake $V_c \to V_1$ resulted in
growth of both $z_{c2}(t)$ and $z_{c1}(t)$. However, the sum of the input pulse
$P_2(t)$ and the input prediction signal $\beta z_{c2}(t) x_c(t-\tau)$ drove $x_2(t)$
to a greater amplitude than $x_1(t)$ was driven by $P_1(t)$ alone.
Therefore $x_1(t)$ was diminished by the inhibiting signal from $x_2(t)$
and $z_{c1}(t)$ did not grow to a very large amplitude. Both $z_{c1}(t)$ and
$x_1(t)$ decayed. On subsequent predictions the prediction input signal
for $V_1$ was not sufficient to overcome the inhibitory signal from $V_2$,
and the correlating signal $v [x_c(t-\tau) x_1(t)]^+$ was zero. Thus $z_{c1}(t)$
was unable to grow on subsequent predictions and the fast forgetting
rate insured that the random mistake would be totally forgotten.

We have said that a random mistake occurs infrequently. Thus
we can expect that the fast forgetting rate will insure that the random
mistake will be forgotten before it occurs again during presentation
of the pattern and there will be no accumulation of experience with the
mistake in $z_{c1}(t)$. Now, look at the successful correction of the
previously learned pattern $V_c \to V_2$ with $V_c \to V_1$ in figure 4.2.3.
It is seen that on the first presentation of the correcting pattern,
the accumulated experience with the erroneous pattern was still
greater than the experience accumulated on the first presentation of the
correcting pattern. At this point the outstar could not be aware
that $V_c \to V_1$ is a correcting pattern and not a random mistake. However,
on the next presentation of $V_c \to V_1$, the experience now accumulated
with $V_c \to V_1$, coupled with the event input pulse, is sufficient to
drive $x_1(t)$ to a greater amplitude than prediction alone can drive
$x_2(t)$. Consequently, $x_2(t)$ is inhibited and $z_{c2}(t)$ grows very little.
The fast forgetting rate now insures that $z_{c2}(t)$ will decay to a point
where a third presentation of the correcting pattern will completely
inhibit prediction of $V_2$ and it is impossible thereafter to
accumulate any more experience with $V_c \to V_2$ by prediction. At this
point we can say that the pattern has been corrected.
It is a combination of brute force correcting resulting in
accumulation of experience with the correcting pattern, rapid forgetting
of the erroneous pattern, and use of accumulating experience to
inhibit the erroneous pattern which accounts for the correctability
property of a laterally inhibiting outstar. The same combination
of processes results in its random mistake resistance. It is the
inability of a simple outstar to couple accumulation of experience
with forgetting that results in the incompatibility of random mistake
resistance with correctability.

Because of the inability to control the amplitudes of grid node
responses with small u's we shall not undertake to study the variation
of these properties in a laterally inhibiting outstar with a fast
forgetting rate. In chapter six, we will present a different
formulation of the outstar equations which controls the amplitudes of the
grid node responses independent of the amplitudes of the z processes
and incorporates a form of lateral inhibition. At that time we will
consider the effect of decreasing the forgetting rate on the properties
of a laterally inhibiting outstar.



section 4.4



Further Remarks on Local Lateral Inhibition



In the first part of experiment III, figure 4.2.1, it was noted
that lateral inhibition appears to have some of the same effects
that thresholds have on the performance of outstars. The evidence
was that presentation of event 3, 0.6 seconds $= 2\delta = 2/\alpha$, after
arrival of the prediction signal at the arrowheads did not result in
any learning of $V_c \to V_3$. Further investigation shows that this result
is of dubious value.

The $x_3(t)$ trace in figure 4.2.3 shows the inhibitory response of
a node to a single input pulse at another node in the grid. The
maximum of this inhibitory response occurs approximately $2\delta = 2/\alpha$
time units after arrival of the inhibitory signal at the node. Thus
the maximum inhibitory response occurs at approximately $\tau' + 2/\alpha$ time
units after beginning excitement of the other node. The result is
maximum inhibition of events presented a little less than $\tau' + 2\delta$
after beginning excitation of a grid node. Now, if an event has been
presented $\tau' + 2\delta$ before arrival of the prediction signal, the event
presented with $\varphi = 0$ presentation phase relative to the arrival of the
prediction signal at the arrowheads would have been most inhibited and
little learning of this event would have resulted. In effect, this
means that to avoid inhibiting an event to be associated with the
command event of one outstar sharing the grid with other outstars,
the interval between event presentations must be greater than
approximately $\tau' + 4/\alpha$ time units.

The reason for the maximum of an inhibitory response occurring
so long after excitation of a node can be seen analytically. If the
total input signal, $I_i(t)$, to an embedding field network node, $V_i$,
is a linear, time invariant function of time, then the node's x process
has a transfer function $1/(s + \alpha)$, such that:

    $X_i(s) = I_i(s)/(s + \alpha)$

A cascade of n nodes has a transfer function of $(1/(s + \alpha))^n$. Due
to the short duration of our input pulses, we are dealing essentially
with the transient response of the x process. Thus the inverse transform
of $1/(s + \alpha)^n$ is a good indication of what our pulse should look like
after having traveled through n nodes.

    $\mathcal{L}^{-1}\{1/(s + \alpha)^n\} = \frac{1}{(n-1)!} t^{n-1} e^{-\alpha t} = x_n(t)$

The maximum of this occurs at:

    $\frac{dx_n(t)}{dt} = 0$, that is, at $t = (n-1)/\alpha$

Thus the more nodes a pulse travels through, the later its maximum
occurs. Of course, the input signal to a grid node in a laterally
inhibiting outstar is partially non linear. However, if we consider
that the z processes vary slowly enough so that we can consider them to
be approximately constant, then the above analysis approximately holds.
Thus in the case where an input is given to one node which inhibits
another, we have a n = 2 node cascade, and:

    $x_2(t) \approx t e^{-\alpha t}$

with maximum at:

    $t = 1/\alpha$ after arrival of the inhibitory signal.

Now, if we add a prediction signal from the command node also, we have
a n = 3 node cascade and:

    $x_3(t) \approx (1/2) t^2 e^{-\alpha t}$

with maximum at:

    $t = 2/\alpha$ after arrival of the inhibitory signal.

[Figure caption, partially recovered: "... effect here is too large to
learn a three event pattern efficiently."]

Thus the occurrence of the maximum inhibitory response between
$\tau' + 1/\alpha$ and $\tau' + 2/\alpha$ is inherent to the network according to the
approximate analysis. The experimental evidence shows that this
approximate analysis is reasonably correct. We will have further
occasion to consider this "lengthening" of pulses as they go through
successive nodes when we study outstar avalanches.
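
The location of the maximum of the n node transient can be confirmed
numerically. The sketch below evaluates $t^{n-1} e^{-\alpha t}/(n-1)!$ on a
fine time grid and finds where it is largest. The function names, the grid
spacing, and the search interval are assumptions chosen for illustration.

```python
import math


def cascade_impulse_response(n: int, alpha: float, t: float) -> float:
    """Inverse transform of 1/(s + alpha)^n: t^(n-1) e^(-alpha t)/(n-1)!."""
    return t ** (n - 1) * math.exp(-alpha * t) / math.factorial(n - 1)


def peak_time(n: int, alpha: float, dt: float = 1e-4,
              t_end: float = 5.0) -> float:
    """Time of the largest sample of the n node transient on a uniform grid."""
    steps = int(t_end / dt)
    best = max(range(steps + 1),
               key=lambda k: cascade_impulse_response(n, alpha, k * dt))
    return best * dt


alpha = 3.3333
# Expected from the analysis: 0, 1/alpha, 2/alpha, 3/alpha.
peaks = {n: peak_time(n, alpha) for n in (1, 2, 3, 4)}
```

For $\alpha = 3.3333$ sec$^{-1}$ the maxima fall at approximately 0, 0.3, 0.6 and
0.9 seconds, in agreement with $t = (n-1)/\alpha$.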

The earlier prediction that a $\beta'$ suitable for learning a pattern
of one event results in inefficient learning of patterns with more than
one event was tested. A N = 4 grid node laterally inhibiting outstar
was used. v = 3.? (second digit illegible) was selected to result in
well learning of one event in one presentation and this was
experimentally verified. All other parameters were the same as in
experiment III. The initial conditions on the z processes were reset to
zero. Three events were presented to the grid $\tau$ time units after
excitation of the command node. A prediction was requested 1/u time
units later. The results are shown in the accompanying figure. The
pattern $V_c \to (V_1, V_2, V_3)$ was learned very poorly. From this evidence
it can be concluded that it would require many more rapid presentations
of this pattern to result in well learning.

Of course with lateral inhibition any $\beta' > 0$ will result in faster
learning of a pattern with fewer events. If we consider the number of
elemental events as a measure of the complexity of a pattern, then this
effect translates into the statement that a complicated pattern
is harder to learn. A laterally inhibiting outstar has some of the
same drawbacks as the human mental process.
CHAPTER 5 THE OUTSTAR AVALANCHE

section 5.1 Introduction

In section 2.1 the outstar avalanche was briefly introduced.
Its geometric schematic and equations were shown in figure 2.1.2
which is here repeated for convenience. The basic idea behind the
avalanche is to arrange the command nodes of many outstars in a
linear cascade. Excitement of the first node in the cascade results
in a prediction signal arriving at the jth command node of the cascade
$j\tau$ time units later. Thus each outstar in the avalanche takes a
picture of the time varying pattern on the grid at integer multiples
of $\tau$. The result is that the avalanche can learn and reproduce a
sampled data approximation of a time varying pattern of events. The
starting command node in the cascade represents an event which is
associated with the start of the time varying pattern.

EQUATIONS GOVERNING NETWORK PERFORMANCE

2.1.4    $\dot{x}_{c1}(t) = -\alpha x_{c1}(t) + P_c(t)$

2.1.5    $\dot{x}_{ci}(t) = -\alpha x_{ci}(t) + \beta x_{c,i-1}(t-\tau)$    for $1 < i \le M$

2.1.6    $\dot{x}_j(t) = -\alpha x_j(t) + \beta \sum_{i=1}^{M} z_{ci,j}(t) x_{ci}(t-\tau) + P_j(t)$    for $1 \le j \le N$

2.1.7    $\dot{z}_{ci,j}(t) = -u z_{ci,j}(t) + v x_{ci}(t-\tau) x_j(t)$

Figure 2.1.2. An outstar avalanche and the equations
governing its performance.

The linear command node cascade essentially acts as a clock to
determine when the data samples are taken. In order to perform this
function we would want the response of each node in the cascade to
the prediction signal from the node immediately before it to be
approximately the same as every other node. This is, however, not
the case with the outstar avalanche arrangement shown in figure 2.1.2.
The reason was discussed in section 4.4 where we noticed that the
response of nodes in a cascade got longer the more nodes a signal
passed through. Based on the transient response of such a linear
cascade, we analytically computed that the maximum of the nth node's
response occurred at $(n-1)/\alpha$. A short experiment was conducted to
test this result. Figure 5.1.1 shows a linear cascade of four nodes
excited by a rectangular pulse at node $V_{c1}$. As can be seen from the
traces $x_{c1}(t)$ through $x_{c4}(t)$, the responses did lengthen by
approximately $(n-1)/\alpha$. The equations used in this experiment were:

    $\dot{x}_{c1}(t) = -\alpha x_{c1}(t) + P_c(t)$

    $\dot{x}_{ci}(t) = -\alpha x_{ci}(t) + \beta x_{c,i-1}(t-\tau)$    for $i \ge 2$

Figure 5.1.1. Response of an outstar avalanche command node cascade.

The growing amplitude of successive node responses in figure
5.1.1 is due to the fact that $\beta$ was selected to result in the $x_{c2}(t)$
response being of approximately the same maximum amplitude as the
$x_{c1}(t)$ response. For the parameter selection shown in figure 5.1.1
this resulted in a $\beta$ of:

    $\beta = \frac{\alpha}{1 - e^{-1}}$

However, the steady state response of a node with transfer function
$1/(s + \alpha)$ to a step input is to amplify the step's amplitude by $1/\alpha$.
Thus, in order to maintain approximately equal amplitude responses in
a cascade, $\beta$ should be selected to be

    $\beta = \alpha$

The $\beta$ in the experiment shown in figure 5.1.1 was too large and
resulted in the amplitude growth shown.
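
The effect of the choice of $\beta$ on a command node cascade can be
illustrated with a short forward-Euler sketch of the two cascade equations
above. This is not the program used for figure 5.1.1. The delay tau = 0.3
sec, the pulse amplitude, the step size, and all names are assumptions made
for illustration.

```python
import math


def simulate_cascade(alpha, beta, tau, n_nodes=4, amp=10.0, dt=0.001,
                     t_end=6.0):
    """Forward-Euler integration of a linear command node cascade.

    Node 1 receives a rectangular pulse of amplitude `amp` and duration
    1/alpha; node i > 1 receives beta * x_{i-1}(t - tau).
    Returns (peak amplitudes, peak times), one entry per node.
    """
    steps = int(round(t_end / dt))
    lag = int(round(tau / dt))
    pulse_steps = int(round(1.0 / (alpha * dt)))
    x = [[0.0] * (steps + 1) for _ in range(n_nodes)]
    for k in range(steps):
        for i in range(n_nodes):
            if i == 0:
                drive = amp if k < pulse_steps else 0.0
            else:
                drive = beta * x[i - 1][k - lag] if k >= lag else 0.0
            x[i][k + 1] = x[i][k] + dt * (-alpha * x[i][k] + drive)
    peaks = [max(trace) for trace in x]
    times = [trace.index(p) * dt for trace, p in zip(x, peaks)]
    return peaks, times


alpha = 10.0 / 3.0
# beta as selected in the experiment versus the steady state choice.
large = simulate_cascade(alpha, alpha / (1.0 - math.exp(-1.0)), tau=0.3)
unity = simulate_cascade(alpha, alpha, tau=0.3)
```

Under these assumptions the larger $\beta$ gives a fourth node response that
exceeds the first, while $\beta = \alpha$ keeps every response at or below the
one before it. In both cases the maxima arrive progressively later along
the cascade.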

The inadvertent amplitude growth in figure 5.1.1 does not detract
from the basic result. A linear cascade of command nodes for an
avalanche is unsatisfactory due to the progressive lengthening of command
node responses. In fact, this effect renders a complex network of
embedding field elements requiring transmission of signals through
many nodes rather impractical. In a later chapter we shall address
this problem directly, but for the time being we shall side step it
by introducing a differently configured avalanche.
[Figure 5.1.2 caption, partially recovered: "... a long axon, and
collaterals. Note that $j\tau$ is the time elapsed from excitement of the
starting command node to arrival of the prediction signal $x_c(t - j\tau)$
at the arrowheads of the collateral group."]




Figure 5.1.2 shows an avalanche which performs the same
theoretical function as that pictured in figure 2.1.2 without the pulse
lengthening effects. The neurophysiological names given to the new
elements of figure 5.1.2 were suggested by the geometric arrangement
of the nervous system in the cerebellum of vertebrates. The long
axon is a long directed edge. At periodic points along the long axon,
the directed edge splits into a continuation of the long axon and a
group of N branches of the directed edge called a collateral group.
Each of the collaterals has an arrowhead impinging on a grid node.

The distance from the starting command node, $V_c$, to the arrowheads
of the jth collateral group is so arranged that the time elapsed from
excitement of the starting command node to arrival of the prediction
signal at these arrowheads is $j\tau$ time units. In each collateral
arrowhead is located a z process for correlating the prediction signal
$x_c(t - j\tau)$ with the grid node responses. This long axon and
collateral geometry performs the clock function of the avalanche.

For ease of reference, the equations for this avalanche are
given here:

5.1.1    $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

5.1.2    $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta \sum_{j=1}^{M} z_{ji}(t) x_c(t - j\tau)$

5.1.3    $\dot{z}_{ji}(t) = -u z_{ji}(t) + v x_i(t) x_c(t - j\tau)$

Equation 5.1.2 is for the response of a grid node in a simple outstar.
We will perform a simple experiment on an avalanche with this
formulation and then change equation 5.1.2 to incorporate lateral
inhibition in our avalanche. The two avalanches thus formed will be called
a simple avalanche and a laterally inhibiting avalanche.
Time does not permit an exhaustive study of avalanches. This
chapter on avalanches is an illustration of the results and problems
of using the outstars studied previously in an avalanche.



section 5.2



A Simple Avalanche



In this section we will use a simple avalanche to learn a time
varying pattern of events. In designing a simple avalanche to do
this, we must first ask what sort of time varying pattern we are
going to have it learn. If we have M collateral groups in our
avalanche and N grid nodes, we must keep track of M x N z processes
during the experiment. To conserve computation time, M x N should
be small. An avalanche with M = 3 collateral groups and N = 2 grid
nodes is chosen. It would be rather unrealistic to expect an avalanche
which takes only three sample data points to approximate a continuous
time varying pattern. Thus we will try to learn a series of time
discrete events. That is, we allow the possibility of the occurrence
of the two events associated with the grid nodes in the environment.

We assume that the events represent time discrete events such as
depressing the key of a piano. We further assume that there is a
minimum time between occurrence of separate patterns of these events and
we synchronize the avalanche's sampling interval $\tau$ with this minimum
interval. To simplify the experiment still further, we shall indicate
the occurrence of these events with equal amplitude rectangular input
pulses to the appropriate grid node and follow the convention of the
past chapters by making the pulse duration $\delta$ equal to the rise time
of a node's response:

    $\delta = 1/\alpha$

With this specification of the allowable input patterns, we have
made the results of the previous chapters applicable to the avalanche.
The other parameters will be specified accordingly:
    $\alpha = 3.3333$ sec$^{-1}$

    $\beta = 1$

    $v = 1.6$ (two presentations for well learning criterion)

    $A = 10$

    $\delta = 1/\alpha = 0.3$ sec

Figure 5.2.1. Results of an experiment with a simple outstar avalanche.

We want $\tau$ to be large enough to avoid significant overlapping
of the "pictures" taken by each collateral group. From the
phase-correlation curves of section 3.5, $\tau = 3/\alpha = 3S$ should
work. Thus $\tau$ is selected to be:

$\tau = 0.9$ sec.

The memory decay time $1/u$ is specified to be the time between
successive presentations and/or predictions of the pattern. Thus:

$u = 1/(4\ \mathrm{sec}) = 0.25\ \mathrm{sec}^{-1}$
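
As a rough check on the pulse convention above, the following sketch integrates a single node equation, dx/dt = -alpha*x + P(t), for one rectangular pulse with the listed A, alpha and S. The forward-Euler scheme, the step size `dt` and the function name `pulse_response` are assumptions made for illustration; the integration method used in the original simulation is not stated here.

```python
import math


def pulse_response(alpha=3.3333, amplitude=10.0, duration=0.3,
                   dt=1e-4, t_end=2.0):
    """Integrate dx/dt = -alpha*x + P(t) with forward Euler (assumed scheme).

    P(t) is a rectangular pulse of the given amplitude on 0 <= t < duration.
    Returns a list of (t, x) samples, starting from x(0) = 0.
    """
    x = 0.0
    samples = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        p = amplitude if t < duration else 0.0
        samples.append((t, x))
        x += dt * (-alpha * x + p)
    return samples


samples = pulse_response()
peak_t, peak_x = max(samples, key=lambda s: s[1])

# Closed-form peak, reached at the end of the pulse:
# (A / alpha) * (1 - exp(-alpha * S))
analytic_peak = (10.0 / 3.3333) * (1.0 - math.exp(-3.3333 * 0.3))
print(f"numerical peak {peak_x:.4f} at t = {peak_t:.3f}; "
      f"analytic peak {analytic_peak:.4f}")
```

With S = 1/alpha the node reaches about 63% of its asymptotic value A/alpha by the end of the pulse, which is the sense in which the pulse duration equals the rise time.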

Figure 5.2.1 shows the pattern presented to the avalanche and the
results. The pattern was presented twice. Symbolically, the pattern
presented was:

$V_c \rightarrow (V_1, V_2),\ (V_1, 0),\ (V_2, 0)$

The grid node responses following $t = 8.8$ seconds are the avalanche's
learned prediction of the pattern, elicited by the excitement of the
starting command node alone at $t = 7.9$ seconds.

As can be seen, the avalanche's prediction is not an unqualified
success. Of course $\alpha$ is too small to approximate the input pulses
with any degree of accuracy. Nonetheless, grid node 1 did respond
with two large amplitude responses in a row and grid node 2 responded
with large responses spaced $2\tau$ apart as in the input pattern.
However, the third response of $x_1(t)$ and the second response of
$x_2(t)$ show that the avalanche has noticeable "picture overlapping"
error problems. Increasing $\alpha$ and/or using thresholds would result
in a better approximation.



section 5.3  A Laterally Inhibiting Avalanche



Although the results with a simple avalanche were not encouraging,
equations 5.1 were modified to produce a laterally inhibiting
avalanche for comparison. To convert a simple avalanche to a laterally
inhibiting one, inhibiting directed edges between the grid nodes must
be added and the equations changed to:

5.3.1  $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

5.3.2  $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta \sum_j z_{ji}(t)\, x_c(t - j\tau) - \beta' [\ldots]$

5.3.3  $\dot{z}_{ji}(t) = -u z_{ji}(t) + v\,[x_i(t)\, x_c(t - j\tau)]^+$

(The lateral inhibition term of 5.3.2, which carries the gain $\beta'$
and the delay $\tau'$, is not legible in the source.)

where:

$[y]^+ = y$ if $y > 0$
$[y]^+ = 0$ if $y \le 0$
Figure 5.3.1 shows the results of performing the experiment of
section 5.2 on a laterally inhibiting avalanche. The parameters used
in this experiment were the same as those in section 5.2 except that
$v = 2.4$ as in the study of the laterally inhibiting outstar. $\tau'$
and $\beta'$ are the same as in that study:

$\tau' = 0.1$ sec.

$\beta' = 2.38$

The prediction response of the grid nodes following $t = 8.8$
seconds in figure 5.3.1 shows that the pattern learned by the avalanche
is definitely not the pattern taught to it. Briefly analyzing the
reasons for this failure, we can see that the deleterious effects of
lateral inhibition all acted in concert. Firstly, lateral inhibition
diminishes the amplitude of a node's response when more than one node
is excited at the same time. The responses to the presentation of both
events at the same time at the beginning of the pattern were therefore
diminished. This resulted in a smaller correlation amplitude for
$z_{11}(t)$ and $z_{12}(t)$ when compared to $z_{21}(t)$, which was the
result of the uninhibited response to the presentation of event 1
alone as the second event of the pattern. The first two responses of
the prediction response of $x_1(t)$ show this effect.

Figure 5.3.1. Results of an experiment with a laterally inhibiting
outstar avalanche.

Secondly, the lengthening of the negative amplitude inhibitory
responses due to transmittal through several nodes resulted in a large
inhibitory response in $x_2(t)$ when event 2 was presented alone as the
third event of the pattern. This resulted in a small correlation
amplitude for $z_{32}(t)$ which was insufficient to drive $x_2(t)$
positive in the prediction at the appropriate time.

Additionally, the errors associated with "picture overlapping"
combined with the above resulted in $x_1(t)$ responding to a third event
that was not in the pattern.

If an attempt were made to improve the laterally inhibiting
avalanche's performance, $\beta'$ should be reduced. It is noted that if
the pattern had been composed on the average of a large number of
events at each sampling with only a few events changing between
samples, the amplitude diminishing effect would not have been as
serious. Due to the large number of nodes in such a pattern, the
resistance to random mistakes composed of a small number of events
would not be compromised with a smaller $\beta'$.

To avoid both the inhibitory response lengthening and the "picture
overlapping" errors, the interval between samples, $\tau$, should be
increased. Of course this last suggestion seriously compromises the
ability of a laterally inhibiting outstar to accurately approximate
a rapidly varying pattern. Thus solution of the response
lengthening problem of a signal that must be transmitted through
several nodes is important. A solution will be proposed in a later
chapter.

The avalanches presented in this chapter were for illustrative
purposes to show some of the problems encountered when outstars are
combined into an avalanche. Rather than dwelling upon the design
improvements which could be made to the avalanches, we will go on
to consider other formulations of outstars, which are the basic
components of an avalanche.



CHAPTER 6



THE VIRTUAL LATERALLY INHIBITING OUTSTAR



section 6.1  Other Outstars Which Control the Maximum Amplitudes
of Grid Node Responses

Lateral inhibition was added to the simple outstar as a means
of using past experience to suppress random mistakes in a pattern.
Its addition was necessitated by the rapid forgetting rate required
to control the amplitudes of prediction responses. There are methods
by which the amplitudes of prediction responses can be controlled
other than by allowing a fast forgetting rate. We will review a few
of them as illustrations of different formulations of the equations
for an outstar and then investigate one of them.

One method of controlling the amplitudes of prediction responses
would be to place an upper bound on the z processes:

6.1.1  $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

6.1.2  $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta z_{ci}(t)\, x_c(t - \tau)$

6.1.3  $\dot{z}_{ci}(t) = -u z_{ci}(t) + [M_z - z_{ci}(t)]^+\, v\, x_i(t)\, x_c(t - \tau)$

where:

$[y]^+ = y$ if $y > 0$
$[y]^+ = 0$ if $y \le 0$

Equation 6.1.3 limits $z_{ci}(t)$ to values between 0 and $M_z$. $M_z$ is
specified such that $\beta M_z x_c(t - \tau)$ produces the maximum grid
response amplitude we are willing to tolerate. This method has limited
random mistake resistance. However, if we specify $v$ such that it
requires several presentations of a pattern to drive a z process to
$M_z$, then the occurrence of one random mistake will result in a
relatively small z amplitude. If $u$ is specified to result in a memory
decay time $1/u$ approximately equal to the average time interval
between consecutive occurrences of the same random mistake, then
equations 6.1.1 through 6.1.3 describe an outstar which has a
relatively slow forgetting rate and amplitude control of the grid node
responses. However, if an outstar governed by this set of equations is
confronted with a random mistake and is then asked to predict the
pattern rapidly for a prolonged period, we can expect the pumping up
process to saturate all the z processes at the value $M_z$, including
the z process associated with the mistake. Thus upper bounding the z
processes to insure that the amplitudes of prediction responses remain
tolerable is not very useful for an outstar functioning in a noisy
environment. Additionally, we could expect that use of a small $u$
would result in poor correctability as in the simple outstar.

A more direct method of controlling the amplitudes of prediction
responses would be to upper bound the grid x processes:

6.1.4  $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

6.1.5  $\dot{x}_i(t) = -\alpha x_i(t) + [M_x - x_i(t)]^+ \left(P_i(t) + \beta z_{ci}(t)\, x_c(t - \tau)\right)$

6.1.6  $\dot{z}_{ci}(t) = -u z_{ci}(t) + v\, x_c(t - \tau)\, x_i(t)$

By specifying $u$ in equation 6.1.6 to be small, the outstar
governed by equations 6.1.4 through 6.1.6 would be able to absorb
random mistakes in its experience as did the simple outstar with a slow
forgetting rate in chapter three. The bound on the grid nodes' x
processes in equation 6.1.5 insures that this outstar will not have
the uncontrolled growth of prediction responses that the slowly
forgetting simple outstar did. However, in this outstar, a large
$z_{ci}(t)$ would result in a maximum prediction input signal to a grid
node of magnitude $M_x$ for as long as $z_{ci}(t)\, x_c(t - \tau) \ge M_x$.
Because the prediction signals have exponentially decaying tails, this
would result in the effective duration of the maximum prediction signal
input getting longer as the $z_{ci}(t)$ process got larger. Thus while
being able to control the amplitude of grid node prediction responses,
we would not be able to control the duration of the responses. In an
outstar, we have absolute control over the shape and amplitude of the
prediction signal $x_c(t - \tau)$ by control of the input pulse to the
command node. Thus by specifying the input pulses we can analytically
compute what the prediction signal looks like. With this knowledge,
a threshold $\Gamma_c$ could be placed on the command node to guarantee
that the prediction signal $[x_c(t - \tau) - \Gamma_c]^+$ is nonzero only
over a specified interval of time. By so restricting the duration of
the prediction signal we could also limit the duration of the grid
node's prediction responses. Again the small $u$ resulting in good
random mistake resistance could be expected to result in poor
correctability. The properties of such an outstar would be interesting
to investigate but time did not allow an investigation in this study.
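
The remark that the prediction signal can be computed analytically from the input pulse can be made concrete. The sketch below assumes a single rectangular command pulse of amplitude A and duration S applied to a node obeying dx/dt = -alpha*x + P(t) from rest, and solves for the interval during which the response exceeds a threshold. The function names and the threshold value 1.0 are hypothetical choices for illustration; no threshold value is given in the text for this outstar.

```python
import math


def rectangular_pulse_response(t, amplitude, alpha, duration):
    """Closed-form x_c(t) for one rectangular pulse starting at t = 0."""
    if t <= 0.0:
        return 0.0
    if t <= duration:
        return (amplitude / alpha) * (1.0 - math.exp(-alpha * t))
    peak = (amplitude / alpha) * (1.0 - math.exp(-alpha * duration))
    return peak * math.exp(-alpha * (t - duration))


def suprathreshold_interval(amplitude, alpha, duration, threshold):
    """Return (t_on, t_off) between which x_c(t) exceeds the threshold.

    Returns None if the response never exceeds the threshold.
    """
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    peak = (amplitude / alpha) * (1.0 - math.exp(-alpha * duration))
    if threshold >= peak:
        return None
    # Rising phase: (A/alpha) * (1 - exp(-alpha*t)) = threshold
    t_on = -math.log(1.0 - alpha * threshold / amplitude) / alpha
    # Decaying phase: peak * exp(-alpha*(t - duration)) = threshold
    t_off = duration + math.log(peak / threshold) / alpha
    return t_on, t_off


# Hypothetical threshold of 1.0 with the input pulse used in this study.
t_on, t_off = suprathreshold_interval(10.0, 3.3333, 0.3, 1.0)
print(f"x_c exceeds the threshold for {t_on:.3f} < t < {t_off:.3f} sec")
```

The delayed signal $x_c(t - \tau)$ exceeds the same threshold over this interval shifted later by $\tau$, so raising the threshold shortens the prediction signal from both ends.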

Another method of controlling the grid node prediction response
amplitude, which we will study, would be to make the prediction input
signal to the grid nodes linearly proportional to the probabilities
$y_i(t)$ which define the outstar's memory of a pattern. By the outstar
theorem, the $y_i(t)$ converge to the pattern probabilities $\theta_i$,
which are constant. Thus when the $y_i(t)$ have converged sufficiently
close to the $\theta_i$ we could expect the prediction signal inputs to
the grid nodes, $\beta y_i(t)\, x_c(t - \tau)$, to be the same independent
of the amplitudes of the $z_{ci}(t)$ processes. As $y_i(t) \le 1$,
specifying $\beta$ would determine the maximum possible prediction
amplitude of the grid node's responses. Additionally, specifying the
$u$ of the z processes to be small would allow absorption of random
mistakes in accumulated past experience.

The equations for such an outstar are:

6.1.7  $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

6.1.8  $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta y_i(t)\, x_c(t - \tau)$

6.1.9  $\dot{z}_{ci}(t) = -u z_{ci}(t) + v\, x_c(t - \tau)\, x_i(t)$

6.1.10  $y_i(t) = z_{ci}(t) \Big/ \sum_k z_{ck}(t)$

Another attractive property of an outstar governed by these equations
is that equal prediction signals will result in equal grid node
responses independent of the amplitudes of the z processes. Thus we
could say that the memory of a pattern is always fresh in such an
outstar's memory and pumping up is not required.

A close examination of equations 6.1.7 through 6.1.10 shows
that an outstar governed by these equations is a laterally inhibiting
outstar. By lateral inhibition we mean the ability of a grid node
responding with large amplitude to diminish the amplitude of grid
nodes responding with lesser amplitudes. From equation 6.1.9, a grid
node responding with a large amplitude will result in a large
correlating amplitude for the associated $z_{ci}(t)$ process. This will
result in a large probability $y_i(t)$ from equation 6.1.10, which in
turn will allow a larger prediction signal input in equation 6.1.8. At
the same time a large $z_{ci}(t)$ will result in a smaller $y_j(t)$ for
nodes not responding with large amplitudes, by the inclusion of
$z_{ci}(t)$ in the denominator of equation 6.1.10 for $y_j(t)$. This in
turn will result in a smaller input prediction signal in equation
6.1.8 for $x_j(t)$. As can be seen from equation 6.1.10, the accumulated
past experience of the outstar in the $z_{ci}(t)$ processes plays a major
part in this lateral inhibition, and thus the past experience can be
counted upon to inhibit the effects of a random mistake. An outstar
governed by these equations combines absorption of random mistakes and
active inhibition of them.
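
The inhibitory effect described above can be illustrated with a minimal numerical sketch of equations 6.1.7 through 6.1.10. It presents the command pulse once, together with one grid event, and reports the resulting probabilities. Several details are assumptions made for this sketch and are not taken from the study: forward-Euler integration with step `dt`, the grid event arriving exactly one delay $\tau$ after the command pulse, and the specific parameter values passed as defaults.

```python
def simulate_virtual_outstar(event_node=1, n_nodes=3, t_end=4.0, dt=1e-3,
                             alpha=3.3333, beta=4.77, u=0.01, v=4.77,
                             tau=0.3, amplitude=10.0, duration=0.3, z0=0.1):
    """One presentation of command + one grid event; returns the final y_i.

    event_node is a 0-based index, so event_node=1 is grid node 2.
    """
    delay_steps = int(round(tau / dt))
    history = [0.0] * delay_steps      # ring buffer holding x_c over the last tau
    xc = 0.0
    x = [0.0] * n_nodes
    z = [z0] * n_nodes
    for k in range(int(round(t_end / dt))):
        t = k * dt
        p_c = amplitude if t < duration else 0.0
        # Assumed timing: the event coincides with the delayed command signal.
        p_event = amplitude if tau <= t < tau + duration else 0.0
        xc_delayed = history[k % delay_steps]    # x_c(t - tau)
        history[k % delay_steps] = xc
        total = sum(z)
        y = [zi / total for zi in z]             # equation 6.1.10
        new_x = [x[i] + dt * (-alpha * x[i]
                              + (p_event if i == event_node else 0.0)
                              + beta * y[i] * xc_delayed)   # equation 6.1.8
                 for i in range(n_nodes)]
        new_z = [z[i] + dt * (-u * z[i] + v * xc_delayed * x[i])  # 6.1.9
                 for i in range(n_nodes)]
        xc += dt * (-alpha * xc + p_c)           # equation 6.1.7
        x, z = new_x, new_z
    total = sum(z)
    return [zi / total for zi in z]


y = simulate_virtual_outstar()
print("y after one presentation:", [round(yi, 3) for yi in y])
```

In this sketch the node that received the event ends with the largest probability, and the other nodes fall below their initial share even though their own z processes also grew. That drop is the inhibition by normalization described in the text.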

The major drawback of such an outstar is that it is not consistent
with the elements of embedding field theory presented in chapter one.
There, neat geometric elements performing one function each were
presented. Because the $y_i(t)$'s perform the prediction signal
amplification function for this outstar, they should be located in the
arrowheads of the directed edges with the z processes. This raises the
problem of how the $z_{ci}(t)$'s from each of the arrowheads of directed
edges from the command node are made simultaneously available at all
the arrowheads to form the $y_i(t)$'s. We have constrained all other
information transmissions in the outstar to finite velocities along
directed edges. Because the $z_{ci}(t)$'s are instantaneously available
at all the arrowheads without any apparent means of traveling between
the arrowheads, the $y_i(t)$ is a virtual process. The outstar described
by equations 6.1.7 through 6.1.10 is therefore called a virtual
laterally inhibiting outstar.

Although the virtual $y_i(t)$ process is not consistent with the
elements of embedding field networks presented in chapter one, we
will study the performance of a virtual laterally inhibiting outstar.
Grossberg has done considerable theoretical work with it. (Ref. ?)
In the realm of theory, there is no reason why a virtual process should
be excluded from consideration. A virtual process does not present
any difficulties to a digital simulation either. Moreover, if we
were to build electrical devices to make an outstar with, we would
have more trouble engineering the transmission delays for prediction
signals than engineering the virtual $y_i(t)$ processes. The only
place where the virtual processes are clearly inapplicable is in the
nervous system of living organisms, where all information transmissions
from one point in the system to another are at a finite velocity.
Whereas a virtual laterally inhibiting outstar is not useful as a
model for nervous systems, it is a legitimate device for study.

section 6.2  Specifying the Parameters in a Virtual Laterally
Inhibiting Outstar

We will perform the same experiment on a virtual laterally
inhibiting outstar as has already been performed on the simple and
laterally inhibiting outstars. Therefore the parameters of the virtual
laterally inhibiting outstar are specified to be the same as in the
other outstars except where there are special considerations to be
made:

Input parameters:

$A = 10$

$S = 1/\alpha = 0.3$ sec.

Network parameters:

$\alpha = 3.3333\ \mathrm{sec}^{-1}$

$\tau = 0.3$ sec.

$N = 3$

Initial conditions on $x_c$ and all $x_i$ are zero.

Selection of $u$, $v$, and the initial conditions on the z processes
will require some discussion.

As the $y_i(t)$ are ratios of $z_{ci}(t)$ to the sum of all $z_{ci}(t)$, we
want at least one of the $z_{ci}$ to have a nonzero initial condition to
avoid the problem of dividing by zero. The initial value should not
be too large, to avoid biasing the network at the beginning of the
experiment. Therefore at least one $z_{ci}$ will be specified to have an
initial condition of 0.1. Again, to prevent biasing of the network
in favor of predicting any one grid event, all the $y_i(t)$ should be
approximately equal. This is accomplished if the initial conditions on
all the $z_{ci}$ are equal. Therefore:

$z_{ci}(0) = 0.1$ for $i = 1, 2, 3$

Notice that this means that there is a nonzero initial condition on
the $y_i(t)$'s:

$y_i(0) = 0.3333$ for $i = 1, 2, 3$
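
The initial condition on the probabilities follows directly from the normalization in equation 6.1.10. A minimal sketch, in which the function name `probabilities` and the explicit zero-sum check are illustrative choices:

```python
def probabilities(z):
    """Normalize the z processes into probabilities y_i = z_ci / sum_k z_ck."""
    total = sum(z)
    if total <= 0.0:
        # All-zero z processes would mean dividing by zero, which is why
        # the text requires a nonzero initial condition.
        raise ValueError("at least one z_ci must be positive")
    return [zi / total for zi in z]


y0 = probabilities([0.1, 0.1, 0.1])
print([round(yi, 4) for yi in y0])
```

Any equal positive initial value gives the same $y_i(0) = 1/N$; the value 0.1 only sets how much later experience is needed to move the probabilities away from that starting point.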

This means that the prediction signal at the beginning of the experiment
is split up evenly between all the nodes in the grid. A prediction
made in the initial state of the experiment will result in all grid
nodes responding equally. We must accordingly modify our interpretation
of what grid node responses mean. Heretofore we have considered the
outstar to be in a state of complete ignorance at the beginning of
an experiment. In the simple and laterally inhibiting outstars this
state of initial ignorance was specified by making the initial
conditions on the z processes zero. A prediction by one of those
outstars while it was in its initial state resulted in no response of
the grid nodes. Thus we were able to reinforce our interpretation of
initial ignorance by saying that there was nothing in the outstar's
memory and the outstar could predict nothing. The virtual laterally
inhibiting outstar does not have this nicety.

We will interpret the prediction responses of a virtual laterally
inhibiting outstar to indicate total ignorance if all grid nodes respond
with the same amplitude. Equivalently, total ignorance is the state
in which all $y_i(t)$ are equal. Note that this interpretation means
that the pattern composed of all the events represented by nodes in
the grid is not perceivable by the outstar. Excitation of all grid
nodes will result in the same values of the $y_i(t)$ as they have
initially. This is equivalent to saying that white light is the same as
complete darkness in this outstar. Thus an intelligible pattern must be
composed of fewer than $N$ events. In our experiment the pattern is
composed of one event out of three and thus is intelligible.

In previous outstars, $v$ has been selected on a "so many
presentations mean well learning" criterion. This was due to the fact
that the prediction signal amplification process, the $z_{ci}(t)$, had to
grow to a certain amplitude before a prediction would drive the grid
nodes to the same amplitudes as presentation of the pattern externally
would drive them. In the virtual laterally inhibiting outstar, this
criterion for $v$ is meaningless. The prediction signal amplification
processes are the $y_i(t)$, which by the outstar theorem are always less
than or equal to unity no matter what the amplitudes of the z
processes are. Thus small amplitude z processes will result in the same
amplitude grid node responses as large amplitude z processes as long
as the ratios $y_i(t) = z_{ci}(t) / \sum_k z_{ck}(t)$ are the same. Thus
specification of $v$ has nothing to do with the amplitude of grid node
responses.

$\beta$, on the other hand, has a great deal to do with the amplitude
of the grid node responses. In previous outstars we have tried to
control the grid node responses so that their amplitudes during a
prediction were approximately equivalent to those attained by
excitement by an event. As $v$ can not be used for that purpose in this
outstar, we will use $\beta$. With this intention, we run into the usual
problem with an outstar possessing some form of lateral inhibition.
That is, we would like to know how many events on the average compose a
pattern. In a laterally inhibiting outstar we saw that a $\beta'$
selected for an average of a small number of events in a pattern
resulted in inefficient learning of a pattern composed of many more
events. Nevertheless, with sufficient instruction and/or predictions,
the laterally inhibiting outstar is able to "well learn" a pattern more
complicated than it was designed to learn.

In the virtual laterally inhibiting outstar, we do not have this
possibility for well learning a pattern more complicated than the ones
the network is designed to learn. If we have $M < N$ events on the
average in a pattern, then the expected value for the $y_i(t)$
corresponding to events in a pattern is $y_i(t) = 1/M$ after learning
has occurred. The $y_i(t)$ for events not in the learned pattern are
small. Now, we can specify $\beta$ such that:

$\beta = bM$

where $b$ is a constant necessary to result in a well learned grid
prediction response for a pattern composed of one event. With this
$\beta$, the input prediction signal to a node representing an event in
the learned pattern is:

$y_i(t)\, \beta\, x_c(t - \tau) = (1/M)\, bM\, x_c(t - \tau) = b\, x_c(t - \tau)$

and thus we get well learned responses.

However, if there are fewer than $M$ events in the pattern learned,
the prediction responses will be larger. If there are more than $M$
events in the pattern learned, the prediction responses will be
smaller. Because the $y_i(t)$ do not change once the pattern is learned,
there is no possibility of changing this situation.
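
The dependence of the prediction gain on pattern size can be worked through with a few numbers. The sketch below uses the idealization from the text that, after learning, each of the k events actually in the pattern has $y_i = 1/k$. The function name, the unit value b = 1.0 and the design size M = 2 are hypothetical values chosen only to show the scaling.

```python
def prediction_gain(b, design_events, actual_events):
    """Gain applied to x_c(t - tau) at a node belonging to the learned pattern.

    beta = b * design_events, and y_i = 1 / actual_events after learning
    (the idealization used in the text), so the gain is beta * y_i.
    """
    beta = b * design_events
    return beta / actual_events


b = 1.0   # hypothetical unit gain for a well learned response
M = 2     # hypothetical design pattern size
gains = {k: prediction_gain(b, M, k) for k in (1, 2, 4)}
for k, g in gains.items():
    print(f"{k} event(s) in the pattern: gain = {g:.2f} * b")
```

A pattern of exactly M events gives the design gain b, a one-event pattern gives twice that, and a four-event pattern gives half, which is the fixed over- or under-response described above.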

Thus the well learning criterion is an unrealistic requirement
for a virtual laterally inhibiting outstar that is confronted with the
possibility of learning a wide variety of patterns. The well learning
criterion was originally introduced because we adopted the convention
of reading the amplitudes of the x processes at the nodes as the
response of a node. As the measurement of very small or very large
amplitudes was impractical, the well learning criterion was adopted as
a measurement standard. For the virtual laterally inhibiting outstar
we could devise another virtual process to interpret grid node responses.
For instance, the probabilities:



would be suitable. However, as the pattern we will teach the outstar
in this experiment is simple and we know that it will be composed
of at most one event, we can retain the well learning criterion for
interpretation. In a more general situation the above discussion
must be considered.

Since we are going to teach the outstar a pattern composed of
at most one event, and we are going to specify $\beta$ according to the
well learning criterion, we can make a quick estimation of what $\beta$
should be. The input prediction signal to the grid node corresponding
to the event in the pattern should have a maximum amplitude equivalent
to the maximum amplitude of an input pulse:

$\beta\, y_i(t)\, (\max x_c(t - \tau)) = A$

For one event in the pattern, $y_i(t) = 1.0$ after learning. Since

$\max x_c(t - \tau) = (A/\alpha)(1 - e^{-\alpha S}) = (A/\alpha)(1 - e^{-1}) = 0.63\,(A/\alpha)$

we want:

$0.63\, \beta\, (A/\alpha) = A$

or:

$\beta = \alpha / 0.63 = 5.28$

Experimentally the appropriate value of $\beta$ was found to be:

$\beta = 4.77$

The 11% error is due to both the naivete of the estimation and the error
inherent in the digital simulation.
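
The estimate can be reproduced in a few lines. This sketch only repeats the arithmetic of the derivation with the stated $\alpha$ and $S$; the variable names are illustrative.

```python
import math

alpha = 3.3333            # x process rise rate, 1/sec
pulse_duration = 0.3      # S, sec
beta_experimental = 4.77  # value reported from the simulation

# Fraction of A/alpha reached by the command node at the end of the pulse.
peak_fraction = 1.0 - math.exp(-alpha * pulse_duration)

# Well learning for one event: beta * peak_fraction * (A / alpha) = A
beta_estimate = alpha / peak_fraction

relative_error = (beta_estimate - beta_experimental) / beta_experimental
print(f"peak fraction  = {peak_fraction:.4f}")
print(f"beta estimate  = {beta_estimate:.3f}")
print(f"relative error = {100 * relative_error:.1f}%")
```

Carrying more digits in $1 - e^{-1}$ moves the estimate slightly, to about 5.27, and the discrepancy with the experimental value stays close to the 11% quoted.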

Having specified $\beta$, we will specify $v$ to be equal to $\beta$
arbitrarily:

$v = \beta = 4.77$

Only $u$ remains to be specified. Since it is claimed that a virtual
laterally inhibiting outstar can use the large z's resulting from a
small $u$ to absorb random mistakes, we will specify $u$ to be small:

$u = 0.01\ \mathrm{sec}^{-1}$

Note again that a small $u$ means that the decay time of the z process,
$1/u$, is large compared to the presentation and/or prediction interval
to be used in the experiment.



section 6.3  Results of the Experiments with a Virtual Laterally
Inhibiting Outstar

Figure 6.3.1 shows the results of presenting the pattern
$V_c \rightarrow V_2$ to the virtual laterally inhibiting outstar twice
and then asking for a prediction of the pattern. As can be seen from
the trace, $V_c \rightarrow V_2$ was well learned. $V_c \rightarrow V_3$
was learned slightly due to the prediction signal's "tail". (Event 3
was presented with presentation phase $= +2S$ with respect to the
prediction signal.) Also note that $x_1(t)$ responds to prediction
slightly although event 1 has not been presented to the outstar.

Looking at the $y_i(t)$ traces in figure 6.3.1 we can see why. All
three $y_i(t)$ started with the same initial values $y_i(t) = 0.3333$ for
$i = 1, 2, 3$. The first presentation of the pattern resulted in $y_2(t)$
rising to a maximum value of nearly 0.8 while $y_1(t)$ and $y_3(t)$
decreased to about 0.1 each. When event 3 was presented $2S$ after
event 2, the $y_i(t)$ changed slightly due to correlation between the
prediction signal's tail and $x_3(t)$. Note that on the second
presentation of the pattern, $y_1(t)$ decreased again and $y_2(t)$
increased. According to the outstar theorem, more presentations of the
pattern [...] because of correlation between the tail of the prediction
signal and $x_3(t)$. However, in the two presentations in figure 6.3.1,
$y_1(t)$ is still large enough to allow some prediction signal through
to excite $x_1(t)$.

If we remember that it was agreed to interpret an equal response
from each of the grid nodes as no response, then we can place imaginary
thresholds, $\Gamma_x$, on the $x_i(t)$ traces. $\Gamma_x$, shown by way of
the third response on the $x_i(t)$ trace, was chosen such that if
$y_i(t) = 0.3333$ for $i = 1, 2, 3$ in the outstar, all grid node
prediction responses would be subthreshold. Thus, by interpreting a
node as not responding until it is suprathreshold, we can interpret the
results in figure 6.3.1 as saying that only $V_c \rightarrow V_2$ was
learned by the outstar. The results of performing an experiment on a
virtual laterally inhibiting outstar with real, versus imaginary,
thresholds will be reported later in this chapter.

Figure 6.3.1. Results of teaching a virtual laterally inhibiting
outstar a pattern.

Of interest is the fact that the $y_i(t)$ did not change during the
prediction. This was an outstar theorem guarantee which is now
experimentally verified.

Figure 6.3.2 shows the results of continuing the experiment. A
simulated random mistake was presented with the pattern by presenting
event 1 at the same time as event 2 was presented. Note that on
subsequent predictions, $x_1(t)$ remained subthreshold. It can be
concluded that this virtual laterally inhibiting outstar is resistant
to random mistakes. However, looking at the $y_i(t)$ traces, it can be
seen that the random mistake did reduce $y_2(t)$, and this effect
persisted through subsequent predictions. Thus, even though the
prediction responses of $x_1(t)$ are subthreshold, the $y_i(t)$ remember
the mistake. It will take several presentations of the correct pattern
to undo the effect of the random mistake. In the discussion of using
large amplitude z processes to absorb mistakes in chapter three, it was
shown that the z processes would reflect the conditional probabilities
$PR_{i/c}$. Up to the end of the experiment in figure 6.3.2, the c event
has been presented 6 times. Event 2 has been presented 3 times and
event 1 has been presented 1 time. Using the past history of the
occurrence of the events in the environment to estimate the
conditional probabilities $PR_{1/c}$ and $PR_{2/c}$, we get:

$PR_{1/c} = 1/6 = 0.1666$

$PR_{2/c} = 3/6 = 0.5$

The ratio $PR_{1/c} / PR_{2/c} = 0.3333$. At the end of the experiment in
figure 6.3.2, $y_1(t) = 0.15$ and $y_2(t) = 0.6666$. The ratio
$y_1(t) / y_2(t)$ is:

$y_1(t) / y_2(t) = (0.15)/(0.6666) = 0.225$

As the $y_i(t)$ are directly proportional to the $z_{ci}(t)$, the above
calculations show that the virtual laterally inhibiting outstar is
more resistant to random mistakes than would be expected if it were just
using large amplitude z processes to absorb mistakes. On the other hand,
we can expect the large z's to reflect the statistics of the
environment somewhat, and the inhibitory mechanism of the outstar is
not sufficient to completely overcome this. Thus some effect on the
$y_i(t)$'s must be expected from the statistics of the environment.
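
The comparison between the environmental statistics and the learned probabilities is simple arithmetic, restated below with the counts and end-of-experiment values quoted in the text. The variable names are illustrative.

```python
command_presentations = 6
event_counts = {1: 1, 2: 3}   # times each event accompanied the command event

# Relative-frequency estimates of the conditional probabilities PR_{i/c}.
pr = {i: n / command_presentations for i, n in event_counts.items()}
frequency_ratio = pr[1] / pr[2]

# Probabilities read from the traces at the end of the experiment.
y_end = {1: 0.15, 2: 0.6666}
y_ratio = y_end[1] / y_end[2]

print(f"PR_1/c / PR_2/c = {frequency_ratio:.4f}")
print(f"y_1 / y_2       = {y_ratio:.4f}")
```

The learned ratio is smaller than the frequency ratio, so the mistake is weighted less heavily than its raw frequency would suggest, but it is not zero, so the mistake has not been forgotten.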

Figure 6.3.3 shows the results of continuing the experiment and
trying to correct the learned pattern $V_c \rightarrow V_2$ with the
pattern $V_c \rightarrow V_1$ by presenting event 1 three times in a row.
As can be seen, the correction attempt was not successful. Looking at
the $z_{ci}(t)$ traces, it can be seen that the past accumulated
experience of $V_c \rightarrow V_2$ in the large $z_{c2}(t)$ is so great
that although the accumulated experience of $V_c \rightarrow V_1$ in
$z_{c1}(t)$ is increasing, it will require many more presentations of
$V_c \rightarrow V_1$ to say that the outstar has corrected the mistake.
This was a phenomenon noticed in the slowly forgetting simple outstar
also. Even though this outstar does laterally inhibit, it is not
surprising that a large amount of experience with a pattern will make it
difficult to convince the outstar that the pattern is a mistake. In
order to improve the virtual laterally inhibiting outstar's
correctability, the forgetting rate $u$ will have to be increased.

Figure 6.3.3. Attempt to correct a previously learned pattern in
a virtual laterally inhibiting outstar. The previously learned pattern
$V_c \rightarrow V_2$ is being corrected with the pattern
$V_c \rightarrow V_1$.

section 6.4  A Virtual Laterally Inhibiting Outstar with Thresholds
and an Intermediate Forgetting Rate Designed to Learn
Patterns of More than One Event

In the previous section, it was concluded that the addition of
thresholds to a virtual laterally inhibiting outstar would be an aid
to the interpretation of responses. It was also concluded that a faster
forgetting rate would increase correctability. In this section, we will
test these conclusions. Additionally, it would be instructive to see
what happens when the pattern being taught to the outstar is composed
of more than one event.

In order to have sufficient possibilities available to study
teaching an outstar a pattern composed of more than one event, the
number of grid nodes, $N$, will be increased to $N = 5$. We will specify
$\beta$ to result in a well learned response for patterns composed of an
average $M = 2.5$ events. The input pulse parameters, the x process rise
rate, $\alpha$, and the transmission delay, $\tau$, will be kept the same
as in section 6.2. The following parameters are therefore specified:

Rectangularly shaped input pulses:

$A = 10$

$S = 0.3$ sec.

$\alpha = 3.3333\ \mathrm{sec}^{-1} = 1/S$

$\tau = 0.3$ sec.

Since thresholds are to be added to the outstar, the equations
governing its performance will have to be changed:

6.4.1  $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

6.4.2  $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + y_i(t)\, \beta\, [x_c(t - \tau) - \Gamma_c]^+$

6.4.3  $\dot{z}_{ci}(t) = -u z_{ci}(t) + v\, [x_c(t - \tau) - \Gamma_c]^+\, [x_i(t) - \Gamma_x]^+$

6.4.4  $y_i(t) = z_{ci}(t) \left[\sum_k z_{ck}(t)\right]^{-1}$

Now we are faced with the problem of assigning values to the
thresholds $\Gamma_c$ and $\Gamma_x$. In section 3.5 it was concluded that
putting thresholds on the grid node x processes of a simple outstar was
inadvisable because this would result in eventual extinction of all
memory. This was due to the fact that the z processes decayed
exponentially at the rate $u$. It was quite possible for the z's to
decay until the prediction input signal
$\beta z_{ci}(t)\, [x_c(t - \tau) - \Gamma_c]^+$ to the grid nodes is
unable to drive the grid node x process suprathreshold. In this
situation the outstar could no longer "pump up" the z process because
the correlating signal
$v\, [x_c(t - \tau) - \Gamma_c]^+\, [x_i(t) - \Gamma_x]^+$ would be zero.
However, in the virtual laterally inhibiting outstar, we do not have
this problem. The prediction signal amplification processes are the
$y_i(t)$, which do not decay. Thus we may specify a nonzero $\Gamma_x$ in
equation 6.4.3.

In fact, use of a grid node threshold is advantageous in a virtual
laterally inhibiting outstar. Besides the interpretive advantage
discussed in section 6.3, there is a real improvement of performance.
Since the convention for interpreting the responses of a virtual
laterally inhibiting outstar says that equal responses by all the nodes
in the grid is a state of total ignorance, we have specified equal
initial conditions on the $y_i(t)$'s. That is, $y_i(t) = 1/N$ for all $i$.
Now suppose that we have a virtual laterally inhibiting outstar in
a state of total ignorance. This means that we have not presented
an intelligible pattern of grid events with the command event. However,
it does not mean that the command event alone has not been presented
to the outstar. In fact, until we decide to teach the outstar that the
command event is associated with an intelligible pattern, we may excite
the command node as many times as we like. Because the prediction
signal so generated is being split up evenly between the grid nodes, the
$y_i(t)$ will not deviate from a state indicative of total ignorance.
However, the correlating signal $v\, x_c(t - \tau)\, x_i(t)$ will become
positive on each such ignorant prediction and the $z_{ci}(t)$ will grow.
We had great difficulty correcting a learned mistake in section 6.3
because the experience with the erroneous pattern was great. If the
outstar is allowed to accumulate experience with the ignorant pattern by
spurious excitements of the command node, then it will be equally
difficult to correct the ignorant pattern with an intelligible one.

Of course, increasing the forgetting rate should partially alleviate this problem. However, it would be better to prevent the outstar from accumulating experience with the ignorant pattern altogether. A properly selected grid node threshold would achieve this result.

In the state of initial ignorance, the amplitude of prediction signal inputs to the grid nodes is:

$(\beta/N)\, x_c(t - \tau)$

as $y_i(t) = 1/N$ for all i. Suppose $\beta$ has been specified to result in a well learned response for an average of $M < N$ events to a pattern. Then the ignorant state input prediction signal is:

$(bM/N)\, x_c(t - \tau)$

where b is a constant which results in a well learned response from a grid node when $b\,x_c(t - \tau)$ is the prediction input signal. Now a well learned prediction response is one in which the maximum amplitude of the response is equal to the maximum amplitude of a response elicited by an event input pulse alone. Knowing the shape, amplitude, and duration of the input pulses, the maximum amplitude of a well learned response can be analytically calculated. For the input pulses of this experiment, it is:

$x_{max}$ = max amplitude of well learned response $= (A/\alpha)(1 - e^{-1}) = 0.63(A/\alpha)$

Thus the proper $\Gamma_x$ to prevent accumulation of experience with the ignorant pattern may be analytically specified by:

$\Gamma_x$ = max amplitude of prediction of the ignorant pattern response $= (M/N)(0.63A/\alpha)$

Knowing that $M = 2.5$, $N = 5$, $A = 10$, $\alpha = 3.333$:

$\Gamma_x = 0.945$

Note that this $\Gamma_x$ will work only for the input pulses specified.
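The arithmetic behind this threshold can be checked with a short sketch. The function names are hypothetical and chosen only for illustration. The peak formula assumes an x process starting from zero and driven by a single rectangular pulse of amplitude A and duration S, as in this experiment.

```python
import math


def well_learned_peak(A, alpha, S):
    """Peak of x(t) for dx/dt = -alpha*x + A, x(0) = 0, after a pulse of length S."""
    return (A / alpha) * (1.0 - math.exp(-alpha * S))


def ignorant_threshold(M, N, A, alpha, S):
    """Peak ignorant-pattern response: the well learned peak scaled by M/N."""
    return (M / N) * well_learned_peak(A, alpha, S)


gamma_x = ignorant_threshold(M=2.5, N=5, A=10.0, alpha=1 / 0.3, S=0.3)
print(round(gamma_x, 3))  # 0.948
```

The small difference from the value 0.945 quoted above comes from rounding $1 - e^{-1}$ to 0.63 before multiplying.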

Outstars are capable of learning patterns independent of the vigor with which they are presented. They are also capable of learning patterns composed of events presented at different strengths. Of course, in a threshold outstar, there is a minimum pulse amplitude A which will result in suprathreshold responses and thus learning. In this study it was decided to maintain the specifications on the input pulses constant because a large number of outstars are being studied. A detailed study of varying the input pulse specifications in each outstar requires a prohibitive amount of time. In an outstar functioning in an environment in which events occur with varied amplitudes, a statistically average well learned response could be used to specify a $\Gamma_x$ sufficient to prevent accumulation of experience with the ignorant pattern on the average. However, this is not a study that will be undertaken in this paper. In this study we are able to completely know ahead of time the exact specifications of our input pulses and are consequently able to specify the parameters of the outstars to result in the performance we want.

Unfortunately, the above analytic method was not completely understood at the time the experiment being reported was performed. $\Gamma_x = 0.45$ was used and consequently the outstar was able to accumulate experience with the ignorant pattern. Rather than re-perform the experiment with the "correct" $\Gamma_x$, it was decided to present the data collected with the "wrong" $\Gamma_x$. It illustrates the problem of accumulating experience with the ignorant pattern. Additionally, examination of the data will reveal that there are other properties associated with any non zero $\Gamma_x$ which are of more consequence than the property of preventing accumulation of experience with the ignorant pattern.

It was decided to specify the command node threshold, $\Gamma_c$, such that there would be no correlation with events presented with presentation phase $\phi$ greater than $\phi = S = 0.3$ seconds. From previous experimental data, $\Gamma_c = 1.0$ will satisfy this criterion.

Addition of a non zero $\Gamma_c$ made the analytical specification of $\beta$ too difficult. Thus a $\beta$ resulting in a well learned response for a pattern composed of $M = 2.5$ events was experimentally determined. The value so determined was:

$\beta = 27.9$

u was increased to test the conclusion that a faster forgetting rate would result in improved correctability. The interval between presentations and/or predictions is 1.8 seconds, which is the same as in previous experiments. Part of the reason for introducing the virtual laterally inhibiting outstar was to use the accumulation of experience with a small u to aid in resisting random mistakes by absorption. Therefore we will not make u so large as to completely destroy this effect. A decay time of twice the interval between successive predictions and/or presentations was selected:

$u = 0.278\ \text{sec}^{-1} = 1/(2 \times 1.8\ \text{sec})$

v was arbitrarily specified to be $v = 10$.
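As a worked illustration of what this forgetting rate implies, assume that a z process receives no correlating input between one event and the next, so that it obeys $\dot{z}_{ci} = -u z_{ci}$ over the 1.8 second interval. Under that assumption:

```latex
\[
z_{ci}(t + 1.8) = z_{ci}(t)\, e^{-u \cdot 1.8}
                = z_{ci}(t)\, e^{-1.8/3.6}
                = z_{ci}(t)\, e^{-0.5}
                \approx 0.61\, z_{ci}(t)
\]
```

That is, roughly 61 percent of the accumulated experience would survive each interval in which the z process is not reinforced.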

Since a pattern composed of $M = 2.5$ events is impossible, it was decided to teach the outstar a pattern composed of 3 events and then test its random mistake resistance. An additional event presented with presentation phase $\phi = 2S = 0.6$ seconds was included with this pattern to illustrate the effect of the command node threshold. After this part of the experiment it was decided to attempt correction of the pattern with a pattern composed of $M = 2$ events. It was decided to make the correcting pattern consist of an event not included in the original pattern and an event that was included in the original pattern. The reason for this selection of correcting events was to see if there is any difficulty in learning that only part of a previously learned pattern is in error.

Before beginning to teach the outstar an intelligible pattern, a prediction of the ignorant pattern was obtained by excitement of the command node alone. This was initially done to demonstrate that a properly selected $\Gamma_x$ would prevent accumulation of experience with the ignorant pattern. Because of the error in specifying $\Gamma_x$, it serves as a demonstration that accumulation of experience with the ignorant pattern is a factor to be considered.

The foregoing discussion is summarized in the box below:
Equations governing performance of the outstar:

$\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

$\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta y_i(t)[x_c(t - \tau) - \Gamma_c]^+$

$\dot{z}_{ci}(t) = -u z_{ci}(t) + v[x_c(t - \tau) - \Gamma_c]^+ [x_i(t) - \Gamma_x]^+$

where:

$[y]^+ = y$ for $y > 0$
$[y]^+ = 0$ for $y \le 0$

Input parameters:

pulse shape is rectangular

$A = 10$

$S = 0.3$ seconds

Network parameters:

$\alpha = 3.3333\ \text{sec}^{-1} = 1/S$

$\beta = 27.9$

$\tau = 0.3$ seconds

$\Gamma_c = 1.0$

$\Gamma_x = 0.45$

$u = 0.278\ \text{sec}^{-1} = 1/(2 \times 1.8\ \text{sec})$

$v = 10$

Initial conditions:

$x_c(0) = 0$

$x_i(0) = 0$ for all i

$z_{ci}(0) = 0.1$ for all i

and: $y_i(0) = 0.2$ for all i

section 6.5

An Experiment with a Virtual Laterally Inhibiting
Outstar with Thresholds and an Intermediate
Forgetting Rate Designed to Learn Patterns of More
than One Event



Figure 6.5.1 shows the first phase of the experiment described in the previous section. The first response on the five grid node x process traces is a prediction of the ignorant pattern elicited by excitement of the command node alone. The z trace for all five $z_{ci}(t)$ shows the experience accumulated by this prediction. Although the increase in amplitude of the z processes due to this single prediction is small, many such predictions would result in an accumulation. Even this small accumulation of experience with the ignorant pattern affects the performance of the outstar when the pattern $V_c \rightarrow (V_1, V_2, V_3)$ is presented to the outstar, as is shown by the $y_i(t)$ traces. One presentation of the pattern is insufficient to result in convergence of the $y_i(t)$ to values describing the pattern and a second presentation is required.

Even though the grid node threshold $\Gamma_x$ is too small to prevent accumulation of experience with the ignorant pattern, it does improve the learning performance of the outstar. Looking at the $x_5(t)$ trace it can be seen that the first presentation of the pattern resulted in a redistribution of the values for the $y_i(t)$. This redistribution was sufficient to prevent $x_5(t)$ from going suprathreshold long enough to add any appreciable amplitude to $z_{c5}(t)$ on the second presentation of the pattern. Due to the reasonably rapid forgetting rate u, $z_{c5}(t)$ continued its decay during the second presentation. With $y_5(t)$ so small that $x_5(t)$ can not be driven suprathreshold, future presentations and/or predictions will result in no further increases in the amplitude of $z_{c5}(t)$. This would be of particular importance if in the first two presentations of the pattern $y_1(t)$, $y_2(t)$, and $y_3(t)$ had not converged so closely to the final values describing the pattern of $y_i(t) = 0.3333$ for i = 1, 2, 3. For, if the $y_i(t)$ were not so close to their final values, then the prediction of the learned pattern would have resulted in further convergence of the $y_i(t)$'s to this final value. The prediction of the learned pattern shown in the fourth response of the grid node x processes shows why. The prediction responses for the nodes $V_1$, $V_2$, and $V_3$ included in the pattern are all suprathreshold and result in an increase in amplitude for the corresponding $z_{ci}(t)$'s. The prediction responses for the nodes $V_4$ and $V_5$ not included in the pattern are subthreshold and therefore do not result in increases in the amplitudes of $z_{c4}(t)$ and $z_{c5}(t)$. Thus the $y_i(t)$ continue to converge during predictions. However, the $y_i(t)$ converged so close to their final values in the two presentations of the pattern shown that this effect can not be seen in figure 6.5.1. A higher resolution look at the $y_i(t)$ showed that $y_1(t)$, $y_2(t)$, and $y_3(t)$ increased from 0.3096 to 0.3225 on this prediction. This phenomenon is not in contradiction to the outstar theorem, which guarantees only that the $y_i(t)$ will not diverge during a prediction. Convergence is therefore theoretically permissible and grid node thresholds result in convergence during predictions.

Figure 6.5.1. Result of teaching a virtual laterally inhibiting

In figure 6.5.1, event 4 was presented $2S = 0.6$ seconds after events 1, 2, and 3 in the pattern. The command node threshold was chosen to prevent any correlation with events presented more than 0.3 seconds after arrival of the prediction signal at the arrowheads. $x_4(t)$ does not go suprathreshold on the following prediction response. The fact that $y_4(t)$ and $z_{c4}(t)$ are identical to $y_5(t)$ and $z_{c5}(t)$ shows that the command node threshold was successful. Presentation of event 4 resulted in a correlation equivalent to no presentation at all.

As can be seen, the $\beta$ selected resulted in learned prediction responses for the three events in the pattern of approximately the same amplitudes as the response elicited by an input pulse alone. (Compare the maximum amplitudes of the prediction responses of $x_1(t)$, $x_2(t)$, and $x_3(t)$ with the maximum amplitude of $x_c(t)$.)

Figure 6.5.2 shows the continuation of the experiment. The pattern $V_c \rightarrow (V_1, V_2, V_3)$ is presented with a simulated random mistake. Event 5 is this mistake. As can be seen, the presentation of the random mistake resulted in a healthy increase in $z_{c5}(t)$. However, this was insufficient to drive $y_5(t)$ large enough to result in a suprathreshold $x_5(t)$ on prediction. Therefore $z_{c5}(t)$ continues to decay on subsequent predictions and is bound for extinction. A slight decrease in $y_5(t)$ can be seen during the prediction response in figure 6.5.2. This is due to the prediction convergence phenomenon described above. Thus we can conclude that more predictions will result in the $y_i(t)$ converging back to the values they had before the occurrence of the random mistake.

Figure 6.5.3 shows the results of continuing the experiment. The previously learned pattern $V_c \rightarrow (V_1, V_2, V_3)$ is corrected by the pattern $V_c \rightarrow (V_1, V_4)$. The difficulty with this correcting pattern is that event 1 is included in both the original pattern and the correcting pattern. As can be seen, it only required four presentations of the correcting pattern to result in subthreshold $x_2(t)$ and $x_3(t)$ responses. Events 2 and 3 were included in the old pattern, but are not included in the new pattern. Additionally, $y_2(t)$ and $y_3(t)$ have decreased in those four presentations to the point where it can be safely concluded that the dominant pattern is $V_c \rightarrow (V_1, V_4)$. This situation should be compared to the unsuccessful attempt to correct a pattern by three presentations of the correcting pattern in the virtual laterally inhibiting outstar with a slow forgetting rate shown in figure 6.3.3. It can be concluded that increasing the forgetting rate does improve the correctability of a virtual laterally inhibiting outstar.

The final values for the $y_i(t)$'s to describe the correcting pattern are:

$y_1(t) = y_4(t) = 0.5$
$y_2(t) = y_3(t) = y_5(t) = 0$

As can be seen, $y_1(t)$ has slightly overshot its final value and $y_4(t)$ has only reached a value of $y_4(t) = 0.38$. However, $y_1(t)$ and $y_4(t)$ are converging toward each other. We may conclude that the previously accumulated experience with event 1, which is common to both patterns, is great enough to make convergence to the new pattern difficult.

It should be noticed that the prediction responses of $x_1(t)$ and $x_4(t)$ at the end of the experiment are both of greater amplitude than a response to an input pulse alone. This is an effect of lateral inhibition. In the old pattern of $M = 3$ events, the prediction response amplitudes of grid nodes associated with the pattern were slightly less than the amplitude of a response to an input pulse alone. $\beta$ had been specified to result in a well learned response for a pattern consisting on the average of $M = 2.5$ events. Thus the 3 event pattern results in smaller than well learned grid node responses and the 2 event pattern results in larger than well learned grid node responses.




CHAPTER 7 OTHER FORMULATIONS FOR THE z PROCESS
section 7.1 Introduction



In the discussion of the laterally inhibiting outstar it was mentioned that the outstar was excitatory biased. The equation for the z processes in the laterally inhibiting outstar was:

6.1.1  $\dot{z}_{ci}(t) = -u z_{ci}(t) + v[x_c(t - \tau)\, x_i(t)]^+$

where:

$[y]^+ = y$ if $y > 0$
$[y]^+ = 0$ if $y \le 0$

By excitatory biasing, it was meant that the learning z processes could only assume non negative values. Thus the input prediction signal to a grid node, $\beta z_{ci}(t)\, x_c(t - \tau)$, is always non negative and can not drive the grid node's x process to negative amplitudes. In this way, the z processes are biased against learning to inhibit grid nodes and are biased in favor of learning to excite them.

In this chapter we shall drop the excitatory biasing restriction and conduct an investigation to see if there is any value in outstars which can learn to inhibit grid nodes as well as excite them by prediction signals from the command node. One reason for conducting this study is that in the laterally inhibiting outstar we had to introduce a new element in the embedding field network elements. The inhibitory directed edges' arrowheads contained z processes which were assigned the permanent value of -1. These z processes did not learn their values as do the z processes in the other arrowheads in the network and we must consider a non learning z process to be a new feature. In the avalanche using a long axon and collaterals we avoided the use of z processes with permanent values of +1. If we solve the pulse lengthening problems of the outstar avalanche, then we will have to use another new element. Development of a general formulation for z processes to cover all z processes would eliminate the need for making exceptions for special design features in a network. We will attempt to formulate more general z processes in this chapter. Throughout, we shall be speaking of embedding field networks which do not have any virtual processes associated with them. The networks we shall discuss conform to the embedding field elements of chapter one.

section 7.2

A Description of the States of the Processes in an
Outstar



A z process at an arrowhead correlates the prediction signals arriving at the arrowhead and the x process at the node upon which the arrowhead impinges; and it remembers what the correlations in the past have been. The z process can therefore be considered to be a function of the past and current states of the adjacent node and the prediction signals. The z process itself can be thought of as being in various states. For instance, we can think of a large amplitude z process as being in an excitatory state, as it allows large prediction signals through to excite the adjacent node. Small amplitude z processes could be thought of as being in an unlearned or ignorant state.

In this chapter we shall use this idea that z processes are in states which may be completely determined by the past history of the states of the prediction signal and the grid node x processes. We shall develop a state function $\mathcal{L}(x_c, x_i)$ which maps the states of the prediction signal $x_c$ and the grid node x process $x_i$ into a z process state $z_{ci}$:

$\mathcal{L}(x_c, x_i) = z_{ci}$

It will be found that this function is a handy way to describe the logic behind the learning process in an outstar and for this reason we shall call the state function $\mathcal{L}$ a "logic". However, before the usefulness of such a "logic" can be demonstrated, we must build up a description of the states of the various processes in an outstar.

In outstars without virtual processes, we are concerned with four processes:

1. Inputs, $P_c(t)$ and $P_i(t)$

2. Node x processes, $x_c(t)$ and $x_i(t)$

3. The prediction signal from the command node, $[x_c(t - \tau) - \Gamma_c]^+$, where $\Gamma_c$ may be zero

4. The z processes, $z_{ci}(t)$

Input pulses, $P_c(t)$ and $P_i(t)$, have been used to indicate the occurrence of events in the environment. There are two possible states for an event. Either it is occurring, or it is not. We have transmitted information about whether an event is occurring or not to the outstar by the input pulse. A positive amplitude has been used to signify that an event is occurring. A zero amplitude has been used to signify that an event is not occurring. The following code can therefore describe the state of inputs and the state of the events they describe:

(a) P = +1 indicates that an event is occurring and that the associated input has a positive amplitude.

(b) P = 0 indicates that an event is not occurring and that the associated input has a zero amplitude.

Node x processes have been used to signify the recent presentation of an event and/or a recent prediction of an event. A large positive amplitude has been interpreted as indicating that the outstar "thinks" that the event represented by the node in question has occurred recently or, at least, should have occurred recently. Small positive amplitudes, or zero amplitudes, have been interpreted as indicating that the outstar is not "thinking" anything about the event represented by a node.

Negative amplitudes have been interpreted as indicating the same state as small or zero amplitudes.







By placing thresholds on the nodes, we were able to precisely determine when an x process was of large enough positive amplitude to indicate that the outstar is "thinking" an event. With thresholds we may replace the word "large" in the preceding paragraph with the word "suprathreshold". In the same manner "small", "zero", and "negative" may be replaced with "subthreshold".

Thus we have two states for a node x process:

(1) $x_i = 1$ indicates a state where the x process at a node is of sufficiently large positive amplitude, or is suprathreshold. This state corresponds to the interpretation that the outstar is "thinking" about the event represented by the node.

(2) $x_i = 0$ indicates a state where the x process at a node is of small or zero positive amplitude, or is subthreshold. This state corresponds to the interpretation that the outstar is not "thinking" about the event represented by the node.

Although the notion "thinking about" corresponds to the psychological interpretation of x processes' amplitudes, it is clumsy. In the outstar, the only way an x process can get into the state $x_i = 1$ is to respond to an input. That is, it must respond to excitement by an input pulse or an input prediction signal, or both. Thus we could describe the state $x_i = 1$ as "responding" or "excited". To avoid semantic difficulties, the state $x_i = 1$ will be called the "excited" state.

For semantic reasons also, the state $x_i = 0$ will not be called "not thinking about". Although "not excited" would apply well to $x_i = 0$, it will not be used either. Instead the state $x_i = 0$ will be called "ambient". "Ambient" is used because it refers to a state which is the usual state of an x process. The ambient state $x_i = 0$ is also the passive state to which an x process always returns. Further, it is the state of an x process when it is not being actively driven by signals from outside the node. Thus it was felt that "ambient" accurately describes the state $x_i = 0$.

In the above listing of states for x processes, an x process responding with a negative amplitude was not included. Although we have followed the convention of interpreting negative amplitudes as being the same as ambient amplitudes, the inhibitory process that results in negative amplitudes is not an ambient process. A negative amplitude can be achieved only if the x process is being actively driven in the negative direction by signals from outside the node. It is therefore definitely not "ambient". There is no reason why our description of the states of x processes should have to conform with our interpretation of what those states mean. We will refer to an x process of negative amplitude as being in the inhibited state and indicate this state by $x_i = -1$. We will continue to interpret the state $x_i = -1$ as indicating the same interpretive state as $x_i = 0$.

The difficulty with the inhibited state is that it is a subjective state within the outstar. In the environment the state of an event can be described as actively occurring or passively not occurring. There is no such thing as an event that actively does not occur. However, we saw that a practical simple outstar with only the two x process states of being excited or being ambient had very little resistance to random mistakes. We added lateral inhibition to allow the outstar an active process whereby it could subjectively prevent events from occurring. Particularly, lateral inhibition was added to subjectively prevent random mistakes from occurring in a previously learned pattern.

Suppose we had a black box that was claimed to be a learning machine. The only way we could determine if it was a learning machine is to teach it something and then see if it could reproduce what we taught it. We would only be able to observe the events we were teaching it and the box's response. Now, the box's response would be events to us. Thus from our point of view the only states the box could communicate to us would be the state of a response occurring or the state of a response not occurring. The state of a response somehow being able to not occur with greater vigor than simply not occurring is meaningless. Thus, our interpretation of what an outstar is doing is limited to what we could observe if the outstar were a black box.

We have used this interpretive convention and will continue to do so. However, an outstar is not a black box to us. We can observe all the processes occurring inside it. Thus we are confronted with the inhibited x process state, which we can observe inside the outstar, but which is meaningless when observed outside the outstar. Inside the outstar the inhibited state is meaningful and definitely corresponds to something other than ambient. Thus we have assigned a separate state to describe the state of an x process which is being actively driven to negative amplitudes by signals from outside the node.

There is some difficulty in saying when an x process is in the inhibited state in an outstar with thresholds. An x process can be actively driven subthreshold by inhibitory processes and still have a non negative amplitude. For simplicity this situation will be considered to be ambient. The inhibited state is therefore only the state in which an x process has a negative amplitude. In the case of a negative amplitude, there is no confusion about the x process at a node being actively driven toward negative values by signals from outside the node.

In summary, the states of an x process at a node are:

(1) The excited state, $x_i = +1$. The amplitude of the x process is large or suprathreshold.

(2) The ambient state, $x_i = 0$. The amplitude of the x process is small, zero, or subthreshold.

(3) The inhibited state, $x_i = -1$. The amplitude of the x process is negative.
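The three-state coding summarized above can be written as a small classifier. This is only an illustrative sketch: the function name `x_state` and its `threshold` argument are hypothetical, and for an outstar without thresholds the cutoff between "small" and "large" would have to be chosen by the reader.

```python
def x_state(amplitude, threshold):
    """Map the amplitude of an x process to the state code -1, 0, or +1.

    -1: inhibited (negative amplitude)
     0: ambient   (zero, small, or subthreshold amplitude)
    +1: excited   (suprathreshold amplitude)
    """
    if amplitude < 0.0:
        return -1
    if amplitude > threshold:
        return 1
    return 0
```

Note that an x process driven below threshold by inhibition but still non negative is coded as ambient, in keeping with the simplification adopted above.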

A prediction signal at an arrowhead is the originating node's method of influencing the other nodes in the network. In order to define our logic $\mathcal{L}(x_c, x_i)$, we will have to assign states to prediction signals at an arrowhead. We could assign the same states to prediction signals as we have assigned to x processes. This would mean that the prediction signal is conveying the state of its originating node to the arrowhead. However, prediction signals do more than convey the state of the originating node to the arrowheads. They also influence the state of the x process at the node upon which the arrowhead impinges. There is no difficulty in allowing a prediction signal to have a large or suprathreshold amplitude and describing this state as the excited state with state value $x_c = +1$. However, the other states we may allow a prediction signal to be in require some discussion.

First, consider the case of a prediction signal coming from a node with a threshold on it. In the past we have used both "real" thresholds and "imaginary" thresholds. The imaginary thresholds were placed on a node for precision in interpreting when the node was responding. The "real" thresholds were placed on a node to prevent the z processes from learning spurious associations when the x process was of small amplitude. In the case of the command node, thresholds were used to prevent the command prediction signal from causing spurious associations to be learned when it was of small amplitude. This was accomplished by restricting the command prediction signal to be zero until it was suprathreshold, i.e. $[x_c(t - \tau) - \Gamma_c]^+$. In this case, we also prevented the prediction signal from influencing the state of the grid node upon which it was impinging until it was suprathreshold. This was accomplished by making the input prediction signal to the grid node $\beta z_{ci}(t)[x_c(t - \tau) - \Gamma_c]^+$.

There is a reason behind this. Suppose we have an outstar grid which is shared by many command nodes representing separate and distinct command events. In the environment, a distinct pattern of grid events usually occurs with each of the command events. If the outstar is to function properly, it must be able to learn that a certain command event, $c_1$, is associated only with the pattern, $\Theta_1$, which occurs with it in the environment. It must be prevented from learning that the patterns occurring with the other command nodes in the environment are associated with $c_1$.

A subthreshold command node x process only occurs when the command event has not occurred recently in the environment. Thus we can expect that a pattern not corresponding to this command event is on the grid when the command node is subthreshold. By making the prediction signal coming from a subthreshold command x process identically zero, we prevent the outstar from building up a wrong association. Additionally, by making the prediction signal identically zero, we prevent it from exciting the grid nodes which are included in the pattern associated with this particular command node. This is important. Consider two command nodes $V_{c1}$ and $V_{c2}$ which represent events $c_1$ and $c_2$, which occur in the environment with patterns $\Theta_1$ and $\Theta_2$ respectively. Suppose $x_{c1}(t)$ is subthreshold and $x_{c2}(t)$ is suprathreshold. Then we can expect that the grid node x processes indicate that the pattern $\Theta_2$ is on the grid. We have already agreed to make the prediction signal $[x_{c1}(t - \tau) - \Gamma_c]^+$ identically zero to prevent $V_{c1} \rightarrow \Theta_2$ from being learned. Suppose, however, that we allow the prediction input signal from $V_{c1}$ to excite the grid nodes representing $\Theta_1$. The pattern on the grid would therefore be the algebraic sum $\Theta_1 + \Theta_2$. The prediction signal coming from the suprathreshold node $V_{c2}$ will therefore cause the association $V_{c2} \rightarrow (\Theta_1 + \Theta_2)$ to be learned. To prevent this possibility we have made the prediction signal input from a subthreshold command node identically zero.

Thus the prediction signal from a subthreshold command node is identically zero. We may as well drop the fiction of assuming that a prediction signal was sent from the command node in the first place and say that a prediction signal is sent out along the directed edges only if the x process at the originating node is suprathreshold.

We also used "real" thresholds interpretively. We now have the case that a subthreshold x process at a node is interpreted as no response. Further, it is unable to influence other nodes in the network because no prediction signal is sent from this node. Thus a certain amount of consistency is added to our interpretation of the amplitudes of the x processes. An x process which indicates no response also has no effect on the other nodes and processes in the network. If we were unable to measure the amplitude of an x process at its node, we would have no way of knowing what amplitude it had as long as it was subthreshold. From the point of view of an external observer or any of the other processes in the outstar, a subthreshold x process is indeed ambient.

Thus we have an "ambient" state for prediction signals at an
arrowhead. It is indicated by a zero amplitude and is assigned the
state value $\bar{x}_c = 0$. It must be remembered that this state
arises from an originating node that was subthreshold $\tau$ time
units before.

In the case of an outstar without thresholds, we lose the precision
in defining when a prediction signal is ambient. We will therefore
describe a small amplitude on a prediction signal to be ambient. As
previously, "small" will mean small relative to the maximum amplitude
of a well learned response.

Having made the prediction signal coming from a subthreshold
x process identically zero, it would be silly to allow a prediction
signal coming from an inhibited x process to be non zero. In this
study we will not consider prediction signals of negative amplitude.
Part of the reason is that allowing an inhibited x process to send out
prediction signals would violate the consistency we have just
developed. An x process state which is interpreted as no response
should not be able to influence the other processes and nodes in the
network.

Another reason is that prediction signals of negative amplitude are
not required. We have seen that the negative amplitude of inhibitory
input prediction signals in lateral inhibition can be accounted for
by allowing z processes with negative values. In fact, lateral
inhibition has been the only case in which we have used inhibition.
The whole function of lateral inhibition was for an excited grid node
x process to inhibit the other nodes in the grid. Thus the emission
of inhibitory prediction signals from a node was only useful when
that node was in the excited state.

In summary, the states of a prediction signal at an arrowhead are:

(1) The excited state, $\bar{x}_c = +1$. The amplitude of the
prediction signal at the arrowhead is large and positive. This results
from a large or suprathreshold x process at the originating node
$\tau$ time units previously.

(2) The ambient state, $\bar{x}_c = 0$. The amplitude of the
prediction signal at the arrowhead is small or zero. This results from
a small, zero, subthreshold, or negative x process at the originating
node $\tau$ time units previously.

We will assign the following states to a z process based upon its
amplitude:

(1) The excitory state, $\bar{z}_{ci} = +1$. A z process is in this
state when its amplitude is large and positive.

(2) The ambient state, $\bar{z}_{ci} = 0$. A z process is in this
state when its amplitude is small or zero.

(3) The inhibitory state, $\bar{z}_{ci} = -1$. A z process is in this
state when its amplitude is negative.

The states for z processes at an arrowhead were assigned according
to what effect a prediction signal modified by the z process would
have on the node upon which the arrowhead impinged. Clearly, a z
process with a large positive amplitude would result in prediction
excitement of the impinged upon node. A z process with a negative
amplitude would result in prediction inhibition of the node. A z
process with a small or zero amplitude would result in very little
disturbance of the impinged upon node. The ambient state for a z
process is also the passive state for a z process. With a non zero
forgetting rate, it is the state to which a z process passively
returns, and it is the state which a z process assumes when it has not
been perturbed by signals from outside the arrowhead.

Up to now, the only way a z process could assume the inhibitory
state $\bar{z}_{ci} = -1$ was by permanent assignment of a negative
value to the z process. In what follows, we will consider new
formulations for the equations for a z process that will allow a z
process to learn to assume the inhibitory state.



section 7.3

Logics

Having described the states of the various processes in an outstar,
we are now ready to introduce the function
$\mathcal{L}(\bar{x}_c, \bar{x}_i) = \bar{z}_{ci}$. $\mathcal{L}$
describes how the state of a z process at an arrowhead is determined
from the states of the prediction signal at the arrowhead and the x
process at the adjacent node. Throughout this discussion the state of
a prediction signal will be denoted by $\bar{x}_c$. The state of the
adjacent x process will be denoted by $\bar{x}_i$, and the state of
the z process will be denoted by $\bar{z}_{ci}$. The choice for the
subscripts was motivated by the geometry of an outstar, but the
discussion is not limited to outstars. It applies to all networks
which may be built from embedding field elements. Throughout, the
function $\mathcal{L}$ will be called a "logic". We will introduce
several distinct logics and they will be distinguished by subscripts,
i.e. $\mathcal{L}_0$, $\mathcal{L}_1$.

A logic is a tabular function. That is, we tabulate all the
possible combinations of prediction signal states and x process
states and assign a z process state to each combination. For example,
the logic for the excitory biased outstars we dealt with previously
is defined by:

Definition of the Excitory Biased Logic, $\mathcal{L}_0$

    $\bar{x}_c$    $\bar{x}_i$    $\mathcal{L}_0(\bar{x}_c, \bar{x}_i) = \bar{z}_{ci}$

       0              0              0
      +1              0              0
       0             +1              0
      +1             +1             +1
       0             -1              0
      +1             -1              0

(The inhibitory prediction signal state $\bar{x}_c = -1$ has been
excluded from consideration for reasons of consistency as explained
in section 7.2.)

The reasons for calling this an excitory logic are clear. The
only states allowed for the z process are the ambient state
$\bar{z}_{ci} = 0$ and the excitory state $\bar{z}_{ci} = +1$. The
ambient state is passive. The z process does not actively learn to be
in the ambient state. Therefore the only state which the z process can
actively learn is the excitory state. Thus the z process is biased to
learn only the excitory state.

The excitory biased logic, $\mathcal{L}_0$, is implemented in an
outstar by the equation:

7.3.1    $\dot{z}_{ci}(t) = -u z_{ci}(t) + v [x_c(t - \tau) - \Gamma_c]^+ [x_i(t) - \Gamma_i]^+$

where either or both thresholds can be zero.

The driving function in equation 7.3.1 is:

    $v [x_c(t - \tau) - \Gamma_c]^+ [x_i(t) - \Gamma_i]^+$

This function is always non negative. It can actively drive the z
process only when the prediction signal and the adjacent x process are
both in the excited state. Additionally, because the driving function
is always non negative, it can only drive the z process in the
direction of increasing positive amplitudes. Thus our tabular
definition of $\mathcal{L}_0$ conveniently summarizes the effects of
equation 7.3.1 on the outstar.

Note that $\mathcal{L}_0$ only describes the immediate effect of the
states of the prediction signal and the adjacent x process on the z
process. It does not describe the current state of a z process based
on the entire past history of the prediction signal and x process
states. That is, $\mathcal{L}_0$ only tells us in which direction the
z process will be driven by the signals at a given time.
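The correspondence between the tabulation and the driving function can be checked mechanically. The following is a minimal sketch, assuming unit amplitudes stand in for the excited and inhibited states and that both thresholds are zero; the names `EXCITORY_LOGIC`, `driving_function` and `logic_from_driving` are illustrative and do not come from the thesis.

```python
# Excitory biased logic as tabulated in the text:
# key = (prediction signal state, adjacent x process state), value = z state.
EXCITORY_LOGIC = {
    (0, 0): 0,
    (+1, 0): 0,
    (0, +1): 0,
    (+1, +1): +1,
    (0, -1): 0,
    (+1, -1): 0,
}

# Assumption: each state is represented by a unit amplitude.
AMPLITUDE = {0: 0.0, +1: 1.0, -1: -1.0}


def positive_part(y):
    """The [y]^+ operation: y when y is positive, otherwise zero."""
    return y if y > 0.0 else 0.0


def driving_function(xc_delayed, xi, v=1.0, thr_c=0.0, thr_i=0.0):
    """Driving term of the excitory biased z process (thresholds optional)."""
    return v * positive_part(xc_delayed - thr_c) * positive_part(xi - thr_i)


def logic_from_driving(xc_state, xi_state):
    """Direction (+1, 0 or -1) in which the driving term pushes the z process."""
    d = driving_function(AMPLITUDE[xc_state], AMPLITUDE[xi_state])
    return (d > 0) - (d < 0)
```

For every one of the six state combinations, `logic_from_driving` returns the tabulated value, which is the sense in which the table summarizes the equation.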

We shall now consider other logics for z processes. A general
approach would be to consider all the possible assignments of
$\bar{z}_{ci}$ states to each of the six distinct combinations of
$\bar{x}_c$ and $\bar{x}_i$ states. However, this results in
$3^6 = 729$ logics. We will therefore have to use some judgement in
selecting the logics to be considered.
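The count of candidate logics is easy to confirm by brute force. The sketch below enumerates every assignment of a z state to the six state combinations and shows how pinning individual table entries shrinks the space; the helper names are hypothetical, and the two pinned entries in the usage note are simply an example of such restrictions.

```python
from itertools import product

# Six combinations: prediction signal state in {0, +1},
# adjacent x process state in {0, +1, -1}.
COMBINATIONS = [(xc, xi) for xi in (0, +1, -1) for xc in (0, +1)]
Z_STATES = (0, +1, -1)


def all_logics():
    """Yield every tabular logic as a dict from state pair to z state."""
    for values in product(Z_STATES, repeat=len(COMBINATIONS)):
        yield dict(zip(COMBINATIONS, values))


def count_logics(fixed=None):
    """Count logics whose entries agree with the pinned entries in `fixed`."""
    fixed = fixed or {}
    return sum(
        1
        for logic in all_logics()
        if all(logic[key] == value for key, value in fixed.items())
    )
```

Calling `count_logics()` gives 729, and pinning any two entries, for example `count_logics({(+1, +1): +1, (0, 0): 0})`, leaves 81.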

A key tenet of embedding field theory is that an excited prediction
signal and an excited x process should result in an excitory z
process. Thus we will only consider logics in which:

    $\mathcal{L}(\bar{x}_c = +1, \bar{x}_i = +1) = +1$

Also, we have always started experiments with the z processes in
the ambient state. That is, the initial conditions on the z processes
have always been small or zero. We have interpreted those initial
conditions as a state of initial ignorance. It would be senseless to
allow a learning machine to develop from initial ignorance to learning
something by itself. For this reason we will only consider logics
in which:

    $\mathcal{L}(\bar{x}_c = 0, \bar{x}_i = 0) = 0$

This reduces the possible logics to $3^4 = 81$. There are no
overriding reasons for excluding broad categories of the remaining
logics. However, 81 logics is just too many to consider. We will only
consider those which show promise in this study. These logics are
defined in the table below:

Table 7.3.1

$\mathcal{L}_0$ is the excitory biased logic we have considered
previously. $\mathcal{L}_1$ is the logic resulting from removing the
non negative restriction on the driving function in equation 7.3.1:

7.3.2    $\dot{z}_{ci}(t) = -u z_{ci}(t) + v x_c(t - \tau) x_i(t)$

As the tabulation of $\mathcal{L}_1$ shows, if $x_i(t)$ is negative,
$z_{ci}(t)$ will learn inhibition. $\mathcal{L}_1$ can be considered a
neutrally biased logic because the z process is not biased in favor of
excitation or inhibition.

$\mathcal{L}_1$ is interesting, but of dubious value. Suppose that
all the z processes in a network are in the ambient state at the
beginning of an experiment. That is, the network is in a state of
initial ignorance at the beginning of an experiment. Then a z process
in this network can not possibly assume the inhibitory state. The
reason is that the only states for input pulses are $\bar{P} = +1$ and
$\bar{P} = 0$. The only states which the x processes in the network
can assume due to input pulses are therefore $\bar{x}_i = +1$ and
$\bar{x}_i = 0$, and the prediction signals in the network can only
assume the states $\bar{x}_c = +1$ and $\bar{x}_c = 0$. The
combination of states $\bar{x}_c = +1$, $\bar{x}_i = -1$ can not
occur. By the tabulation of $\mathcal{L}_1$, the state
$\bar{z}_{ci} = -1$ can not be attained.

Thus the logic $\mathcal{L}_1$ is effectively equal to the logic
$\mathcal{L}_0$. If we allowed the permanent assignment of negative
values to z processes in a network governed by $\mathcal{L}_1$, then
it is possible for the learning z processes in the network to learn
inhibition. However, this requires the artificiality of a z process
with a permanently assigned value.
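The reachability argument above can be replayed at the level of states. The sketch below is an illustration under stated assumptions, not the thesis's procedure: the neutrally biased table is derived here from the sign of the product of the prediction signal and the x process, a prediction input to a node is taken to have the sign of (z state) times (prediction signal state), and the names `NEUTRAL_LOGIC` and `learnable_z_states` are invented for the example.

```python
# Neutrally biased logic, assumed here to follow the sign of x_c * x_i.
NEUTRAL_LOGIC = {
    (0, 0): 0, (1, 0): 0, (0, 1): 0,
    (1, 1): 1, (0, -1): 0, (1, -1): -1,
}


def sign(value):
    """Return +1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def learnable_z_states(logic, permanent_z=()):
    """States a learning z process can reach from initial ignorance.

    Input pulses and prediction signals are restricted to the states 0 and +1.
    `permanent_z` lists states of any permanently assigned z processes.
    """
    prediction_states = (0, 1)
    input_states = (0, 1)
    learned = {0}  # initial ignorance: every learning z process is ambient
    while True:
        all_z = learned | set(permanent_z)
        # x process states: those set by inputs plus those set by predictions.
        x_states = set(input_states)
        x_states |= {sign(z * xc) for z in all_z for xc in prediction_states}
        new = {logic[(xc, xi)] for xc in prediction_states for xi in x_states}
        if new <= learned:
            return learned
        learned |= new
```

With no permanently assigned z process the reachable set is `{0, 1}`, so the neutral logic behaves like the excitory one; passing `permanent_z=(-1,)` makes the inhibitory state reachable as well.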

$\mathcal{L}_3$ defined in table 7.3.1 is particularly interesting in
an outstar. As can be seen from the tabulation, z processes in a
network governed by $\mathcal{L}_3$ can learn inhibition from a state
of initial ignorance without the use of z processes with permanently
assigned negative values. The two assignments:

    $\mathcal{L}_3(\bar{x}_c = +1, \bar{x}_i = 0) = -1$

    $\mathcal{L}_3(\bar{x}_c = +1, \bar{x}_i = -1) = -1$

insure this. In an outstar, these assignments mean that a command
node can learn to inhibit grid nodes which do not correspond to events
in the pattern associated with the command event. Consider a command
event c which usually occurs with the pattern $\Theta_c$ in the
environment. Let the grid events $\{ i_c \}$ be the events which
compose this pattern. Let the grid events $\{ j_c \}$ be the remaining
events represented by grid nodes. Then the assignment:

    $\mathcal{L}_3(\bar{x}_c = +1, \bar{x}_i = +1) = +1$

means that the z processes $z_{c i_c}(t)$ associated with the grid
nodes included in the pattern will learn excitation. The assignment:

    $\mathcal{L}_3(\bar{x}_c = +1, \bar{x}_i = 0) = -1$

means that the z processes $z_{c j_c}(t)$ associated with the grid
nodes not in the pattern will learn inhibition. Further, the
assignment:

    $\mathcal{L}_3(\bar{x}_c = +1, \bar{x}_i = -1) = -1$

insures that once these z processes have learned inhibition, they will
continue to do so. The result is that after having learned the
pattern, presentation of the command event alone will result in the
grid nodes included in the pattern being excited. The grid nodes not
included in the pattern will be inhibited. If a random mistake occurs
in the pattern, the learned inhibition will cause it to be suppressed.

We will consider an outstar governed by $\mathcal{L}_3$ in detail in
the next chapter. The rest of this chapter will be devoted to an
outstar governed by $\mathcal{L}_2$.

section 7.4

Formulation of the z Process Conforming to Logic $\mathcal{L}_2$

The logic $\mathcal{L}_2$ is defined by the tabulation:

Table 7.4.1

    $\bar{x}_c$    $\bar{x}_i$    $\mathcal{L}_2(\bar{x}_c, \bar{x}_i) = \bar{z}_{ci}$

       0              0              0
      +1              0             -1
       0             +1             -1
      +1             +1             +1
       0             -1             -1
      +1             -1             -1

In this section we shall develop a formulation for a z process
that will conform to this tabulation. However, we might inquire
beforehand if this is a worthwhile endeavor. The large number of
inhibiting assignments makes $\mathcal{L}_2$ appear somewhat useless.
In the discussion of $\mathcal{L}_3$ in the previous section, we saw
that the following assignments in table 7.4.1 are useful:

    $\bar{x}_c$    $\bar{x}_i$    $\mathcal{L}_2(\bar{x}_c, \bar{x}_i) = \bar{z}_{ci}$

       0              0              0
      +1              0             -1
      +1             +1             +1
      +1             -1             -1

We only have to establish the possible usefulness of the other two
assignments:

7.4.1    $\mathcal{L}_2(\bar{x}_c = 0, \bar{x}_i = +1) = -1$

7.4.2    $\mathcal{L}_2(\bar{x}_c = 0, \bar{x}_i = -1) = -1$

Assignment 7.4.1 above says that a z process will learn inhibition
if a grid node is excited and the prediction signal is not. This
combination can only occur if a pattern not corresponding to the
command event is on the grid. Thus learning to inhibit this pattern by
prediction when the command event is presented is useful. Assignment
7.4.2, however, can get us into trouble. Suppose there are two command
nodes $V_{c1}$ and $V_{c2}$ sharing the same grid. Let the command
events $c_1$ and $c_2$ represented by these nodes usually occur with
the distinct patterns $\Theta_1$ and $\Theta_2$ respectively. Let the
event represented by grid node $i_b$ be an event which is included in
pattern $\Theta_1$ but not included in pattern $\Theta_2$. Then we can
expect that excitement of $V_{c2}$ will result in the x processes of
the grid nodes assuming the values describing $\Theta_2$.
Additionally, because of the assignment
$\mathcal{L}_2(\bar{x}_c = +1, \bar{x}_i = 0) = -1$, node $i_b$ will
be inhibited. Therefore assignment 7.4.2 will result in
$z_{c1 i_b}(t)$ learning inhibition. If this is learned sufficiently
well, subsequent excitement of $V_{c1}$ will result in grid node $i_b$
being inhibited even though it is part of the pattern $\Theta_1$
associated with $c_1$.

This vividly illustrates some of the problems we can get into with
logic $\mathcal{L}_2$. It is not the only one. If it happens that the
command node or the grid nodes in an outstar are randomly excited for
some time, then $\mathcal{L}_2$ will cause all the z processes in the
outstar to learn inhibition. When we get around to teaching the
outstar the pattern associated with the command event, we will have to
overcome this initial inhibitory biasing. In a real environment, this
will probably be the case. Our outstar will be "born" with all of its
z processes in the ambient state. It will then spend a period of time
in the environment before "going to school". In this period the random
occurrence of the command event and grid events is highly likely.
Therefore, when the outstar "goes to school" all of its z processes
will probably be inhibitory biased.

In order to prevent this inhibitory biasing from destroying the
outstar's ability to learn when it goes to school, we will limit
its effect. That is, we will limit the maximum negative amplitude
of a z process to a value that will insure that positive associations
can not be completely inhibited. This rather vague statement will
become clearer as we progress in the study of an outstar governed
by $\mathcal{L}_2$.

A formulation for the z processes in an outstar that conforms
to $\mathcal{L}_2$ is:

7.4.1    $\dot{z}_{ci}(t) = -u z_{ci}(t) + v \left( a (x_c(t - \tau) + x_i(t))^2 - b (x_c(t - \tau) - x_i(t))^2 \right)$

with b > a; a > 0

Expanding the right hand side of equation 7.4.1 we get:

7.4.2    $\dot{z}_{ci}(t) = -u z_{ci}(t) + v \left( -(b - a) x_c^2(t - \tau) - (b - a) x_i^2(t) + 2(a + b) x_c(t - \tau) x_i(t) \right)$

with b > a; a > 0

From equation 7.4.2 it can be seen that this formulation conforms to
$\mathcal{L}_2$.
It is interesting exactly how this formulation came about. In
the progress of the experimental study for this thesis report, the
author began thinking of simulating an outstar on an analogue
computer. At that time the idea of logics had not been thought of. The
author was interested only in simulating an excitory biased outstar on
an analogue computer. To do this the z process driving function:

    $v x_c(t - \tau) x_i(t)$

had to be simulated. The product of two varying signals is implemented
on an analogue computer by means of square law devices. For example,
the product xy is implemented by forming the sums:

    x + y and x - y.

These sums are then scaled by constant factors a and b. Each sum is
sent through a separate square law device and then the difference is
formed:

    $a(x + y)^2 - b(x - y)^2$

Expanded, this is:

    $(a - b)x^2 + (a - b)y^2 + 2(a + b)xy$

Thus, by selecting the scale factors a and b such that:

    a = b

the result of this process is:

    $2(a + b)xy$

Scaling this by 1/(2(a + b)) results in the desired product.
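The square law multiplication described above is simple to reproduce numerically. The following is a minimal sketch with hypothetical function names; it also exposes the residual term that appears when the two scale factors are not matched, which is the term that gives the mismatched case its extra behavior.

```python
def square_law_product(x, y, a=1.0, b=1.0):
    """Form a(x + y)^2 - b(x - y)^2 and rescale by 1 / (2(a + b))."""
    raw = a * (x + y) ** 2 - b * (x - y) ** 2
    return raw / (2.0 * (a + b))


def mismatch_residual(x, y, a, b):
    """Extra term left over when a != b: (a - b)(x^2 + y^2) / (2(a + b))."""
    return (a - b) * (x * x + y * y) / (2.0 * (a + b))
```

With matched factors the function returns the plain product, for example `square_law_product(3.0, -2.0)` gives -6.0. With `a < b` the output equals the product plus a negative residual that grows with the squared amplitudes of both signals.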

It was recognized that an outstar so simulated with $a \neq b$ would
have some of the desirable properties of $\mathcal{L}_3$. A digital
simulation of an outstar with the formulation 7.4.1 for the z
processes was run. The results were confusing, and in an attempt to
clearly define the properties of this outstar the idea of logics and
process states was conceived. Having developed this concept, it was
realized that it was a handy description of the possibilities for
formulating other z processes. Additionally, it was a convenient
method of predicting what an outstar with various z process
formulations would do.

The z process formulation given in equations 7.4.1 or 7.4.2 has
some interesting properties other than those described by the
tabulation of $\mathcal{L}_2$. The z process driving function in
equation 7.4.1 is:

    $D(t) = v \left( a (x_c(t - \tau) + x_i(t))^2 - b (x_c(t - \tau) - x_i(t))^2 \right)$

This is composed of two competing processes. The process driving the
z process in the direction of an excited state is
$a (x_c(t - \tau) + x_i(t))^2$. Competing with it is the process
$-b (x_c(t - \tau) - x_i(t))^2$ which drives the z process in the
direction of an inhibited state.

Of particular concern to us is the point where these competing
driving functions exactly balance one another. This point is achieved
when:

    $a (x_c(t - \tau) + x_i(t))^2 = b (x_c(t - \tau) - x_i(t))^2$

Let $\mu$ be the ratio of the adjacent node x process and the
amplitude of the prediction signal at the arrowhead, i.e.:

    $\mu = x_i(t) / x_c(t - \tau)$

then:

    $\left( (\mu + 1)/(\mu - 1) \right)^2 = b/a$

or, since b/a > 1 > 0:

    $(\mu + 1)/(\mu - 1) = \pm \sqrt{b/a}$

which is a real value.

Using the positive square root, we get:

    $\mu_0^+ = (\sqrt{b/a} + 1)/(\sqrt{b/a} - 1)$

Using the negative square root, we get the inverse:

    $\mu_0^- = (\sqrt{b/a} - 1)/(\sqrt{b/a} + 1)$

Note that:

    $0 < \mu_0^- < \mu_0^+$

This calculation shows us that there are two ratios, $\mu_0^-$ and
$\mu_0^+$, where the competing driving functions are balanced. Note
that $\mu_0^- = 1/\mu_0^+$, which is as it should be from the
definition of $\mu$. For a ratio $\mu$ between the prediction signal
and the x process, where $\mu$ falls in the range:

    $\mu_0^- < \mu < \mu_0^+$

the total driving function D(t) is positive. Thus the z process is
being driven in the excitory direction. Note that the bounds
$\mu_0^-$ and $\mu_0^+$ of this region are both positive. Since we do
not allow negative prediction signals, this means that D(t) is
positive only when both $x_c(t - \tau)$ and $x_i(t)$ are positive, in
conformity with $\mathcal{L}_2$. Outside the region
$\mu_0^- < \mu < \mu_0^+$, D(t) is negative and the z process is
being driven in the inhibitory direction.
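The balance calculation can be checked numerically. The sketch below is illustrative only: `crossover_ratios` and `driving` are hypothetical names, and the ratio is taken as the x process amplitude divided by the prediction signal amplitude.

```python
import math


def crossover_ratios(a, b):
    """Return (mu_minus, mu_plus), the ratios at which the drive is zero."""
    if not (b > a > 0):
        raise ValueError("the derivation assumes b > a > 0")
    s = math.sqrt(b / a)
    mu_plus = (s + 1.0) / (s - 1.0)
    return 1.0 / mu_plus, mu_plus


def driving(xc_delayed, xi, a, b, v=1.0):
    """Total driving function D for given signal amplitudes."""
    return v * (a * (xc_delayed + xi) ** 2 - b * (xc_delayed - xi) ** 2)
```

With a = 0.707 and b = 1, `crossover_ratios` returns a pair of reciprocal values near 0.086 and 11.6. The drive is zero at either ratio, positive for ratios between them, and negative outside that range, including when the x process is zero or negative.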

The ratio $\mu_0^+$ and its reciprocal $\mu_0^-$ are called the cross
over ratios for obvious reasons. By specifying a and b to result in
a particular cross over ratio, we can specify a sort of "floating"
threshold on the z process. The thresholds we have considered
previously have all been "fixed". That is, the amplitude of the
process they were thresholding was compared to their fixed value. If
it was greater than this fixed value we got a different result than
when it was less. The floating threshold in the z process under
consideration is a function of the ratio of the amplitudes of the
prediction signal and the x process. If this ratio falls in a certain
range we get one result and if the ratio falls outside this range we
get another. The range is completely determined by the constants a
and b.

One further analytic property of D(t) is that, for a fixed prediction
signal amplitude, it is a concave function of the ratio $\mu$. It
therefore has a maximum with respect to $\mu$ which we compute:

    $\partial D(t) / \partial \mu = v x_c^2(t - \tau) \left( -2\mu(b - a) + 2(a + b) \right) = 0$

or the maximum of D(t) with respect to $\mu$ occurs at:

    $\mu_{max} = (a + b)/(b - a)$

Note that $\mu_0^- < \mu_{max} < \mu_0^+$.

This says that the maximum "force" driving a z process in the
excitory direction occurs when the prediction signal and the x process
are in the ratio $\mu_{max}$ to one another. There is no minimum to
D(t). Thus the driving function D(t) seems to be biased in favor of
driving the z process in the inhibitory direction. To compensate
for this and to cover the initial inhibitory biasing of this z
process, we will artificially bound D(t) on the negative side. That
is, we will use a driving function D*(t) defined by:

    $D^*(t) = S(M_z + z_{ci}(t)) D(t)$

where $M_z > 0$

and where:

    $S(y) = 1$ if $y \geq 0$
    $S(y) = 0$ if $y < 0$

By the proper selection of $M_z$, $z_{ci}(t)$ will be prevented from
assuming large negative values that would totally inhibit the learning
of excitory associations.
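The effect of the bound can be seen by integrating a single z process under a constant inhibitory drive. The sketch below rests on stated assumptions: the step function is assumed to multiply the whole driving function, the integration is plain forward Euler, the constant amplitude 4.0 in the usage note is chosen only to make the unbounded equilibrium fall below the bound, and all function names are invented for the example.

```python
def unit_step(y):
    """S(y): 1 when y is non negative, otherwise 0."""
    return 1.0 if y >= 0.0 else 0.0


def bounded_driving(z, xc_delayed, xi, a, b, v, m_z):
    """Driving function gated off whenever z has fallen below -m_z."""
    d = v * (a * (xc_delayed + xi) ** 2 - b * (xc_delayed - xi) ** 2)
    return unit_step(m_z + z) * d


def mu_max(a, b):
    """Ratio at which the excitory drive is largest (requires b > a)."""
    return (a + b) / (b - a)


def integrate_z(xc_delayed, xi, a, b, v, m_z, u, dt=0.001, steps=60000):
    """Forward Euler integration of dz/dt = -u z + D*; returns (final, lowest)."""
    z = 0.0
    lowest = 0.0
    for _ in range(steps):
        z += dt * (-u * z + bounded_driving(z, xc_delayed, xi, a, b, v, m_z))
        lowest = min(lowest, z)
    return z, lowest
```

For instance, `integrate_z(4.0, 0.0, 0.707, 1.0, 0.25, 2.10, 0.278)` holds the prediction signal at a constant amplitude with no grid activity. Without the gate the z process would settle near -4.2; with it, the lowest value stays within a hair of -2.10.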



section 7.5

Specification of the Parameters in an Outstar Conforming
to Logic $\mathcal{L}_2$

By incorporating the equation for a z process developed in the
previous section, we get the equations governing an outstar conforming
to logic $\mathcal{L}_2$:

7.5.1    $\dot{x}_c(t) = -\alpha x_c(t) + P_c(t)$

7.5.2    $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + \beta z_{ci}(t) x_c(t - \tau)$

7.5.3    $\dot{z}_{ci}(t) = -u z_{ci}(t) + S(M_z + z_{ci}(t)) v \left( a (x_c(t - \tau) + x_i(t))^2 - b (x_c(t - \tau) - x_i(t))^2 \right)$

With this formulation, the z processes in the arrowheads of the
directed edges from the command node can learn inhibition. If they do,
then the excitement of the command node will result in direct
inhibition of the grid nodes. For this reason, the outstar governed by
equations 7.5 will be called a directly inhibiting outstar. We will
run the same experiment that we have used on other outstars. Therefore
the parameters of the directly inhibiting outstar are specified to be
the same as in the other outstars except where there are special
considerations to be made:

Input parameters:

    The input shape is rectangular
    A = 10
    $\delta$ = 0.3 seconds

Network parameters:

    $\alpha$ = 3.3333 seconds$^{-1}$
    $\beta$ = 1
    $\tau$ = 0.3 seconds
    N = 3

Initial condition on all variables is zero.

The presentation rate for presentations and/or predictions will
be one per 1.8 seconds. u will be specified such that the decay time
1/u for the z processes will be twice the presentation interval:

    u = 0.278 sec$^{-1}$ = 1/((2)(1.8 sec))

To specify a and b we must select the cross over ratio $\mu_0^+$. If
$\mu = x_i(t)/x_c(t - \tau)$, then $\mu_0^+$ is the ratio between
those functions at which the competing driving functions in the z
process balance. A cross over ratio of $\mu_0^+ = 11.5$ was selected
arbitrarily. Thus:

    $b/a = \left( (\mu_0^+ + 1)/(\mu_0^+ - 1) \right)^2 = 1.414$

Arbitrarily, b was selected to be b = 1.

Therefore a = 0.707.

With these parameters, v was experimentally determined by the
criterion that two presentations mean well learning. The value of v so
determined was:

    v = 0.25

$M_z$, the lower bound on negative excursions of the z processes,
requires some thought. $M_z$ should be specified such that an
amplitude of $z_{ci}(t) = -M_z$ will not prevent learning of excitory
associations. Consider equation 7.5.2 for the x processes when
$z_{ci}(t) = -M_z$:

    $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) - \beta M_z x_c(t - \tau)$

If the node $V_i$ is being excited by an input pulse we want the
combination of inputs to $V_i$,

    $P_i(t) - \beta M_z x_c(t - \tau)$

to be sufficiently positive to drive $x_i(t)$ to values such that:

    $x_i(t) > x_c(t - \tau)/\mu_0^+ = x_c(t - \tau)/11.5$

If this condition is met, then the driving function for $z_{ci}(t)$
will be positive and $z_{ci}(t)$ will move away from the value
$z_{ci}(t) = -M_z$ in the excitory direction. In such a situation, the
outstar will always be able to learn that the command node is
excitorally associated with a grid event by sufficiently many
presentations.

Analytically, the maximum amplitude for $x_c(t - \tau)$ is
$(A/\alpha)(1 - e^{-1})$. If we make
$P_i(t) = \beta M_z (A/\alpha)(1 - e^{-1})$, then we could expect the
inhibitory input $\beta M_z x_c(t - \tau)$ and the excitory input
$P_i(t)$ to approximately cancel. In this case,
$\beta M_z = \alpha/(1 - e^{-1}) = 5.28$. Since $\beta = 1$, we
therefore want $M_z < 5.28$ at least. To allow room for errors,
$M_z = 2.10$ was selected.
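The parameter arithmetic of this section can be retraced in a few lines. This is a sketch with hypothetical helper names, assuming a rectangular input pulse of amplitude A and duration 0.3 seconds acting on a node with decay rate $\alpha$. Evaluated in floating point, the ratio b/a for a cross over ratio of 11.5 comes out near 1.417 and the cancelling bound near 5.27, both close to the figures quoted in the text.

```python
import math


def decay_rate(presentation_interval):
    """u chosen so that the decay time 1/u is twice the presentation interval."""
    return 1.0 / (2.0 * presentation_interval)


def ratio_b_over_a(mu_plus):
    """b/a required for the upper cross over ratio mu_plus."""
    return ((mu_plus + 1.0) / (mu_plus - 1.0)) ** 2


def peak_x(amplitude, alpha, pulse):
    """Peak of an x process driven from rest by one rectangular pulse."""
    return (amplitude / alpha) * (1.0 - math.exp(-alpha * pulse))


def cancelling_bound(amplitude, alpha, beta, pulse):
    """Value of M_z at which the peak inhibitory input equals the input pulse."""
    return amplitude / (beta * peak_x(amplitude, alpha, pulse))
```

With A = 10, $\alpha$ = 3.3333, $\beta$ = 1 and a 0.3 second pulse, the selected value of 2.10 sits well below the cancelling bound, which is the margin the text asks for.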

To investigate the effect of random occurrences of the command
event in inhibitorally biasing the outstar before it "goes to school",
the command node alone was excited once before presentation of the
pattern.
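The experiment just described can be sketched as a small digital simulation. The code below is a minimal illustration under several assumptions that are not taken from the thesis: forward Euler integration with a fixed step, a single command node and a single grid node, a command node equation of the form dx_c/dt = -alpha x_c + P_c, and a step function gate multiplying the whole z driving term. The function name `simulate` and its arguments are hypothetical.

```python
def simulate(command_onsets, grid_onsets, t_end, dt=0.001,
             alpha=3.3333, beta=1.0, u=0.278, v=0.25, a=0.707, b=1.0,
             m_z=2.10, tau=0.3, amplitude=10.0, pulse=0.3):
    """Integrate one command node, one grid node and the z process between them.

    Returns a list of (t, x_c, x_i, z) samples, one per time step.
    """
    delay_steps = int(round(tau / dt))
    steps = int(round(t_end / dt))

    def rect(t, onsets):
        # Rectangular input pulses of fixed amplitude and duration.
        return amplitude if any(t0 <= t < t0 + pulse for t0 in onsets) else 0.0

    xc_history = [0.0]  # xc_history[k] is x_c at step k
    xi = 0.0
    z = 0.0
    trace = []
    for k in range(steps):
        t = k * dt
        xc = xc_history[k]
        # Prediction signal: the command x process delayed by tau.
        xc_delayed = xc_history[k - delay_steps] if k >= delay_steps else 0.0
        drive = v * (a * (xc_delayed + xi) ** 2 - b * (xc_delayed - xi) ** 2)
        gate = 1.0 if m_z + z >= 0.0 else 0.0
        dxc = -alpha * xc + rect(t, command_onsets)
        dxi = -alpha * xi + rect(t, grid_onsets) + beta * z * xc_delayed
        dz = -u * z + gate * drive
        trace.append((t, xc, xi, z))
        xc_history.append(xc + dt * dxc)
        xi += dt * dxi
        z += dt * dz
    return trace
```

Calling `simulate([0.0], [], 1.8)` excites the command node alone and leaves the z process slightly negative, while `simulate([0.0], [0.3], 1.8)` presents the grid event just as the prediction signal arrives and leaves the z process positive.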
section 7.6

Experiments with a Directly Inhibiting Outstar

Figure 7.6.1 shows the results of performing an experiment with
the directly inhibiting outstar specified in the last section. Note
that excitement of the command node alone at the beginning of the
experiment results in small negative amplitudes for the z processes.
The directly inhibiting outstar is thus slightly inhibitorally biased
before "going to school". "School" begins with the second presentation
of the command event. From the $x_2(t)$ trace it can be seen that the
pattern was approximately well learned in two presentations.

Event 1 is not presented. The $z_{c1}(t)$ trace shows that the outstar
has learned to directly inhibit grid node $V_1$.

Event 2 was presented with $\phi = 0$ presentation phase with respect
to the arrival of the prediction signal. (Presentation phase has been
explained in section 3.5.) Event 3 was presented with presentation
phase $\phi = 0.6$ seconds after event 2. As can be seen from the
$x_3(t)$ and $z_{c3}(t)$ traces, the outstar has learned to inhibit
grid node $V_3$.

The experiment was continued to test the resistance of the directly
inhibiting outstar to random mistakes in the pattern. Figure 7.6.2
shows the results. Event 1 is the simulated random mistake. As can
be seen from the $x_1(t)$ and $z_{c1}(t)$ traces, the direct
inhibition the outstar learned before the occurrence of this mistake
resulted in little damage to the pattern. $z_{c1}(t)$ rose to a small
positive amplitude which is decaying. The prediction following
occurrence of the mistake did not cause $z_{c1}(t)$ to increase. Thus
we may conclude that the outstar will forget the mistake entirely in
time.

The experiment was continued to test the correctability of the
directly inhibiting outstar. Figure 7.6.3 shows the results. An
attempt was made to correct the previously learned pattern
$V_c \to V_2$ with the correcting pattern $V_c \to V_3$ by presenting
$V_c \to V_3$ twice. Figure 7.6.3 shows that the attempt was
unsuccessful. The first presentation of $V_c \to V_3$ was treated like
a random mistake. The previously learned inhibition of $V_3$ was
sufficient to prevent $z_{c3}(t)$ from rising to much of a positive
amplitude. The next presentation of $V_c \to V_3$ did result in a
healthy increase in $z_{c3}(t)$. Further presentations of
$V_c \to V_3$ will result in it being learned better. However,
$V_c \to V_2$ was not "unlearned" during this time. Both excitements
of the command node resulted in approximately well learned responses
by $x_2(t)$.

Figure 7.6.3. The result of attempting to correct the previously
learned pattern $V_c \to V_2$ with the correcting pattern
$V_c \to V_3$.

The only method by which this directly inhibiting outstar can
correct a pattern is to forget the old pattern while learning the new
pattern. From the $z_{c2}(t)$ trace we can see that the presentation
rate for $V_c \to V_3$ was just right to result in "pumping up"
$z_{c2}(t)$ such that $V_c \to V_2$ remained well learned during the
correction attempt. Thus the outstar could not forget $V_2$ while
learning $V_3$. The addition of lateral inhibition and/or increasing
the forgetting rate u would probably increase the correctability, but
these options were not investigated.

In the discussion of the logic $\mathcal{L}_2$ in section 7.4 it was
noted that random excitation of the grid nodes without excitation of
the command node might result in inhibition of a learned pattern. The
assignment:

    $\mathcal{L}_2(\bar{x}_c = 0, \bar{x}_i = +1) = -1$

is the source of this possible trouble. It was decided to see if this
was indeed a problem. All the grid nodes were excited twice without
exciting the command node. The command node was then excited to see
what would be predicted on the grid. Note that because of the
unsuccessful correction attempt, the outstar had learned
$V_c \to (V_2, V_3)$ at the end of figure 7.6.3.

Figure 7.6.4. Result of excitement of grid nodes alone.

Figure 7.6.4 shows the result. $z_{c2}(t)$ and $z_{c3}(t)$ were very
slightly driven in the direction of inhibition by the grid node
excitements. However, as the prediction shows, the pattern
$V_c \to (V_2, V_3)$ is still in the outstar's memory. It can still be
completely recovered by "pumping up".

This result does not mean that there is no problem with random
excitements of the grid nodes in an outstar conforming to
$\mathcal{L}_2$. It only means that it is not a significant problem in
the outstar under study.

section 7.7

Generality of the Formulation of the z Process
Conforming to Logic $\mathcal{L}_2$

The z process formulation conforming to logic $\mathcal{L}_2$ that we
have used is:

7.7.1    $\dot{z}_{ci}(t) = -u z_{ci}(t) + S(M_z + z_{ci}(t)) v \left( a (x_c(t - \tau) + x_i(t))^2 - b (x_c(t - \tau) - x_i(t))^2 \right)$

where:

    $S(y) = 1$ if $y \geq 0$
    $S(y) = 0$ if $y < 0$

As was shown in section 7.4, setting a = b in equation 7.7.1
will result in:

7.7.2    $\dot{z}_{ci}(t) = -u z_{ci}(t) + S(M_z + z_{ci}(t)) 2v(a + b) x_c(t - \tau) x_i(t)$

Equation 7.7.2 describes a z process conforming to the neutrally
biased logic $\mathcal{L}_1$ of table 7.3.1. By setting $M_z = 0$ in
equation 7.7.2, we get the excitory logic $\mathcal{L}_0$ of table
7.3.1, which has been the logic we have used in the simple and
laterally inhibiting outstars. Thus the z process formulated by
equation 7.7.1 is rather general. By specifying the parameters a, b,
and $M_z$ we have a choice of which logic and what type of outstar we
shall get.

The general application of equation 7.7.1 does not end there.
By appropriate specification of the parameters a, b, $M_z$, and v we
can make a z process governed by it "practically inhibitorally
biased". Suppose, for example, that we wished to make a laterally
inhibiting outstar. We connect all of the grid nodes with directed
edges and arrowheads. Previously we have used z processes with a
permanently assigned negative value to get laterally inhibiting
prediction signals. However, we can now make all the z processes in
the network conform to equation 7.7.1. By proper selection of a, b,
and v we can make the z processes in the laterally inhibiting
arrowheads negative most of the time.

To do so, we depend on the statistics of the environment. It is
unlikely that any two x processes in the grid will be excited to
identically equal amplitudes for very many times in succession.
Therefore, by specifying the cross over factors
$\mu_0^+ = 1/\mu_0^- = 1$, we can be almost certain that the z
processes in the arrowheads will learn inhibition.

An experiment was conducted to test this conclusion. Two nodes,
$V_1$ and $V_2$, were connected by a directed edge as shown at the
bottom of figure 7.7.1. The originating node, $V_1$, was excited four
times in succession by input pulses. The "receiving" node, $V_2$, was
excited twice exactly when the prediction signal arrived at the
arrowhead.

The parameters used in the experiment were:

Input parameters:

    Input pulse shape is rectangular
    A = 10
    $\delta$ = 0.3 seconds

Network parameters:

    All initial conditions were zero
    $\alpha$ = 3.3333 seconds$^{-1}$
    $\beta$ = 1
    u = 0.278 seconds$^{-1}$
    $\tau$ = 0.3 seconds
    v = 1.0
    a = 0.12
    b = 1.45
    $M_z$ = 2.10

Figure 7.7.1. Demonstration of a z process which learns inhibition.

From the selection of a and b, the cross over ratio was computed.
Figure 7.7.1 shows the result. The initial excitement of $V_1$ alone
resulted in the $z_{12}(t)$ process being driven to its negative limit,
$-M_z$. The two presentations of event 2 exactly at the time that the
prediction signal $x_1(t-\tau)$ arrived at the arrowhead resulted in
$z_{12}(t)$ being driven to a positive amplitude. However, the fourth
excitement of $V_1$ resulted in $z_{12}(t)$ returning to inhibitory values.
Thus we may conclude that the $z_{12}(t)$ process will behave as an inhibitory
process most of the time. Note also that we did not have to specify
the cross over factor to be exactly 1 to get this result.

Of course, specifying a = 0 in equation 7.7.1 would make the z
process always inhibitorally biased. The above experiment was conducted
to show that we did not have to go to this extreme to get the desired
results.

If we go to the other extreme and specify b = 0 in equation 7.7.1,
we get:

7.7.3    $\dot{z}_{ij}(t) = -u z_{ij}(t) + v a\,S(M_z + z_{ij}(t))\,(x_i(t-\tau) + x_j(t))^2$

This formulation will result in the z process being driven to
positive amplitudes whenever $x_i(t-\tau)$, or $x_j(t)$, or both, are
nonzero. Thus we can replace the permanently assigned positive z processes
in the command node cascade in an avalanche with "learning" z processes
that are governed by the same general equation as all the other z
processes in the avalanche.

The z process formulation given by equation 7.7.1 is therefore
general enough to be used in all the applications we have found for
z processes in outstars and avalanches. We could specify that all
the z processes in a network be governed by this formulation. The
special features of the network such as a command cascade or lateral
inhibition can be implemented by appropriate selections for the
parameters a, b, and $M_z$. Thus the design of an outstar or an avalanche
could be reduced to specification of these parameters at each of the
arrowheads in the network.
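
The way a single formulation yields inhibitory, excitatory, or product
behavior can be illustrated with a few lines of code. The sketch below
is not the simulation program used in this thesis. It evaluates the
right-hand side of equation 7.7.1 at one instant; the function names,
the unit amplitudes, and the reading of b as 1.45 from the parameter
list of the experiment are assumptions made for the illustration.

```python
def step(y):
    """Step function S(y) of equation 7.7.1: 1 if y >= 0, otherwise 0."""
    return 1.0 if y >= 0 else 0.0


def z_dot(z, x_pre, x_post, a, b, u, v, m_z):
    """Right-hand side of equation 7.7.1 for one arrowhead.

    x_pre  stands for the delayed prediction signal x_i(t - tau).
    x_post stands for the x process x_j(t) at the receiving node.
    (Both names are hypothetical, chosen for this sketch.)
    """
    drive = a * (x_pre + x_post) ** 2 - b * (x_pre - x_post) ** 2
    return -u * z + v * step(m_z + z) * drive


# u, v and M_z as listed for the experiment of figure 7.7.1.
params = dict(u=0.278, v=1.0, m_z=2.10)

# Originating node excited alone (x_post = 0): drive = (a - b) * x_pre**2 < 0.
alone = z_dot(0.0, 1.0, 0.0, a=0.12, b=1.45, **params)

# Both nodes at equal amplitude: the difference term vanishes, drive > 0.
together = z_dot(0.0, 1.0, 1.0, a=0.12, b=1.45, **params)

# a = b recovers the product form 2 v (a + b) x_pre x_post of equation 7.7.2.
product_form = z_dot(0.0, 1.0, 2.0, a=0.5, b=0.5, **params)

print(alone, together, product_form)
```

With these values the z process is pushed negative when the originating
node fires alone and positive when both nodes are active at equal
amplitude, which is the qualitative behavior reported for figure 7.7.1.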
CHAPTER 8



THE CHEMICAL OUTSTAR 



section 8.1 Introduction

At this point there are three outstanding promises made in the
previous chapters. In the introduction to chapter one, it was promised
that this thesis would examine Grossberg's theoretical proposal for
the neurophysiological processes that allow a living organism to
learn. In chapter five it was promised that a solution to "pulse
lengthening" in a cascade of nodes would be developed. In chapter
seven it was promised that an examination of a logic corresponding to
the logic in table 7.4.1 would be made.

We shall keep these promises in this chapter. A synthesis of
all three will be developed and we shall examine its performance.

section 8.2 The Analogy Between Embedding Field Networks and the
Nervous System of Living Organisms

Figure 8.2.1 shows the analogy between embedding field network
elements and the elements of the nervous system of a living organism.
A thorough perusal of figure 8.2.1 would explain this analogy to the
reader better than volumes of words.

For the uninitiated, a brief description of the neurophysiological
elements and processes shown in figure 8.2.1 is offered. The dark
cell body and axon shown is an interneuron in the spinal column of
a vertebrate. The light cell body and axon is a motoneuron. Neurons
are living cells. They occur in organisms in a variety of shapes.
However, they always consist of a reasonably elongated part called
an axon, and a "fatter" part called the cell body. The cell body
contains the cell's nucleus. An interneuron and a motoneuron were
chosen for figure 8.2.1 because they have been extensively studied and
the information shown was easy to collect.

The traces shown are voltages recorded by microelectrodes inserted
into the interneuron and the motoneuron at the places shown. These
recordings correspond to the following sequence of events: The
interneuron is excited by an electrical signal delivered to the cell
body by a microelectrode. This signal results in the membrane potential
of the cell body rising from its resting potential of approximately
-70 mV. There are two parts to this positive increase in the cell
body membrane potential: the excitatory postsynaptic potential, EPSP,
and the action potential (spike). The EPSP is the lower trace which
is shown as a solid line. If the EPSP does not rise to suprathreshold
values, then it is the only signal recorded at the cell body. Further,
a subthreshold EPSP does not result in an action potential (spike)
being propagated down the axon.

When the EPSP rises to suprathreshold values, a spike is propagated
down the axon. In addition, the spike is "reflected" back into
the cell body giving rise to the dotted line spike trace shown
superimposed on the EPSP.

The spike is formed at the point where the cell body narrows
down to form the axon. It propagates down the axon at a finite
velocity which is on the order of 5 meters/sec to 100 meters/sec. The
type of neuron and the covering on the axon determine the propagation
velocity. In a particular type of neuron, the propagation velocity
is fixed. All spikes are transmitted at the same velocity. Spikes
also always have the same amplitude and shape.

The end of an axon generally breaks up into a number of collaterals.
Each collateral ends in a swelled portion called a bouton. These
boutons are located immediately adjacent to another neuron's cell
body. The bouton-cell body junction is called a synapse. For this
reason the geometric arrangement of the neurons shown is described
as an interneuron "synapsing" on a motoneuron. We have shown the spike
propagated down the axon as it arrives at the synapse. Note that it
is delayed due to the finite transmission velocity.

A spike arriving at a synapse causes the adjacent cell body
membrane potential to rise from its resting potential with an EPSP.
If the EPSP rises to suprathreshold values, a spike is propagated
down this neuron's axon.

There is a short delay between the arrival of a spike at the
synapse and the beginning of an EPSP at the adjacent cell body. This
is because the cell body being synapsed upon is not excited electrically
by the spike. Instead, the spike causes the release of a chemical
substance in the space between the bouton and adjacent cell body.
This chemical substance is called transmitter. It causes the EPSP
in the synapsed upon cell body by changing the cell body's permeability
to different ionic species.

A magnification of a synapse is shown. The space between the bouton
and the cell body is called the synaptic cleft. Under an electron
microscope, the synaptic cleft is revealed to hold a number of small
particles called vesicles. It is currently believed that these vesicles
are packages of transmitter which burst open when a spike arrives at
the synapse.

The reason for these voltage traces is relatively easy to understand.
A neuron is surrounded by an interstitial fluid in which various ions
are dissolved. The interior of a neuron is also a fluid like substance
in which ions are soluble. The boundary between the interior of the
neuron and the interstitial fluid is a membrane which is selectively
permeable to ions. In a neuron at rest, the membrane is permeable
to potassium ions, K+, but reasonably impermeable to sodium ions, Na+.
There is additionally a "sodium pump" in the membrane which continuously
ejects Na+ ions from the neuron's interior. To maintain electrical
and chemical equilibrium of the overall system, there is a higher
concentration of K+ inside the neuron than outside. The reverse is
true for Na+. The result is that the interior of the neuron is
approximately 70 millivolts negative with respect to the interstitial fluid.

Electrical stimulation of the membrane results in a sudden change
in the membrane permeability. The membrane becomes permeable to
Na+ ions and they diffuse into the neuron. This results in a sudden
increase in the voltage of the neuron's interior with respect to the
interstitial fluid. In a very short time the membrane regains its
impermeability to Na+ ions. K+ ions then diffuse out of the neuron
to redress the equilibrium and the potential across the membrane drops
to the resting potential. The net effect is a small loss of K+ ions
and a small increase of Na+ ions inside the neuron. The sodium pump
will redress this in a short time. Thus with microelectrodes inserted
into the neuron the potential across the membrane can be measured
and electrical traces similar to those shown can be recorded.

Release of the transmitter substance in the synaptic cleft by a
spike causes similar membrane permeability changes which result in
an EPSP.



Next to the neurons we have shown the geometrical elements and
processes which occur in embedding field elements. Grossberg has
proposed the following analogy between the neurophysiological phenomena
in an organism and embedding field theory:

Except for the last correspondence, figure 8.2.1 shows that the
analogy is in general very good. There are differences in detail which
we will take the time to explain here.

The x processes shown are not divided into an EPSP and a
superimposed spike. Further, the maximum amplitude of the prediction signal
is directly proportional to the amplitude of the x process which, in
turn, is directly proportional to the amplitude of the input pulse.
The amplitude of a spike on an axon is constant and independent of the
amplitude of the signal exciting the cell body.

However, the situation we have shown on the interneuron is the
response to a single excitation of short duration and limited amplitude.
In the usual case the EPSP is suprathreshold for a reasonably long
time. This results in a barrage of spikes being propagated down the
axon. The frequency of these spikes is proportional to the strength
of the stimulus exciting the cell body. In Grossberg's proposal,
the amplitude of the portion of the x process that is suprathreshold
is considered to be proportional to the spiking frequency in a neuron.
Thus a prediction signal represents a barrage of spikes.

section 8.3 Summary of the Theoretical Proposal for the
Neurophysiological Process of Learning in
Living Organisms

We have seen that an outstar network composed of embedding field
elements is capable of learning. The key to this ability is the z
process at an arrowhead. The z process at an arrowhead correlates the
prediction signal arriving at the arrowhead with the x process at the
adjacent node. It remembers this correlation in its amplitude and
allows prediction signals to excite the adjacent node in proportion to
its amplitude. By writing down the equations governing the embedding
field network shown in figure 8.2.1, we can see this clearly:

8.3.1    $\dot{x}_1(t) = -\alpha x_1(t) + P_1(t)$

8.3.2    $\dot{x}_2(t) = -\alpha x_2(t) + P_2(t) + \beta z_{12}(t)\,[x_1(t-\tau) - \Gamma_1]^+$

8.3.3    $\dot{z}_{12}(t) = -u z_{12}(t) + v\,[x_1(t-\tau) - \Gamma_1]^+\,[x_2(t) - \Gamma_2]^+$

where:

$[y]^+ = \begin{cases} y & \text{if } y > 0 \\ 0 & \text{if } y \le 0 \end{cases}$

From the $x_2(t)$ trace in figure 8.2.1, we can conclude that the
network has already learned that $V_1$ and $V_2$ are associated. That is,
$z_{12}(t)$ is of sufficient amplitude to result in a well learned
prediction response by $x_2(t)$.
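
The behavior of a two node network of this kind can be explored with a
short numerical sketch. The code below is an illustration only, not the
simulation program used in this thesis. It assumes an explicit Euler
integration, a delay expressed as a whole number of time steps, and
thresholds that default to zero; the function and argument names are
hypothetical.

```python
def pos(y):
    """Threshold bracket [y]^+: y if y > 0, otherwise 0."""
    return y if y > 0 else 0.0


def simulate_pair(p1, p2, dt, alpha, beta, u, v, tau_steps,
                  gamma1=0.0, gamma2=0.0):
    """Euler integration of an originating node, a receiving node,
    and the z process at the arrowhead between them.

    p1, p2 are lists of input samples, one per time step.
    tau_steps is the prediction signal delay in time steps.
    """
    n = len(p1)
    x1 = [0.0] * (n + 1)
    x2 = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for k in range(n):
        # Delayed value of x1; zero before the first signal can arrive.
        x1_delayed = x1[k - tau_steps] if k >= tau_steps else 0.0
        signal = pos(x1_delayed - gamma1)
        x1[k + 1] = x1[k] + dt * (-alpha * x1[k] + p1[k])
        x2[k + 1] = x2[k] + dt * (-alpha * x2[k] + p2[k]
                                  + beta * z[k] * signal)
        z[k + 1] = z[k] + dt * (-u * z[k]
                                + v * signal * pos(x2[k] - gamma2))
    return x1, x2, z


# Rectangular pulses of amplitude 10 lasting 0.3 seconds (dt = 0.01).
n = 100
p1 = [10.0 if k < 30 else 0.0 for k in range(n)]
p2 = [10.0 if 30 <= k < 60 else 0.0 for k in range(n)]

# The receiving node is excited when the prediction signal arrives.
_, _, z_paired = simulate_pair(p1, p2, 0.01, 3.3333, 1.0, 0.278, 1.0, 30)

# The originating node is excited alone.
_, _, z_alone = simulate_pair(p1, [0.0] * n, 0.01, 3.3333, 1.0, 0.278, 1.0, 30)

print(z_paired[-1], z_alone[-1])
```

In this sketch the z process grows only when the delayed signal and the
receiving node's x process overlap in time, which is the correlating
property described above.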

In order for the interneuron in figure 8.2.1 to excite the
motoneuron with spikes, there must be transmitter substance in the synaptic
clefts. If we make the amount of transmitter substance released by a
barrage of spikes proportional to $\beta z_{12}(t)\,[x_1(t-\tau) - \Gamma_1]^+$, then the
equations governing the embedding field network could accurately
describe the nervous network. If we further made the amount of
transmitter substance available for release proportional to the amplitude
of $z_{12}(t)$, then equation 8.3.3 could describe how, why, and how much
transmitter substance is available in the synaptic cleft. Grossberg
has proposed this as a concrete theoretical explanation of the
neurophysiological phenomena underlying learning in living organisms. His
proposal is that transmitter substance is produced in a synaptic
cleft at a rate proportional to the correlation of the frequency of
spikes arriving at the bouton and the membrane potential and/or spiking
frequency of the adjacent cell body. He has proposed additional
refinements and an exact mechanism which gives this result in reference

It is doubtful that the ability of an interneuron to excite a
motoneuron in the spinal column of vertebrates is learned. As we have
said, the neurons selected for figure 8.2.1 were selected because of the
extensive information that has been collected on them. However, the
arrangement of neurons in the medulla, cerebellum, and cerebrum of
vertebrates is similar and we do know that learning occurs in these
organs. The similarity between the embedding field network and the
nervous network in figure 8.2.1 is uncanny. Grossberg has shown
theoretically, and we have shown experimentally, that embedding field
networks can learn. Thus Grossberg's proposal could explain learning
in organisms at the microscopic level. The proposal is even more
attractive when it is recalled that embedding field theory originated
as a model for the macroscopic psychological phenomena of learning.

This thesis originally intended to simulate Grossberg's proposal
in detail and compare it to existing neurophysiological experimental
data. However, the time was not available. A simplistic stab was
made in this direction. The reason was that nervous networks are
capable of transmitting a signal through a cascade of neurons without
"pulse lengthening" occurring. To solve this problem in an embedding
field node cascade, an attempt was made to model the embedding field
elements more closely on neurophysiological elements. At the same time,
attempts to implement the logic of table 7.4.1 in an outstar were being
made. The simplistic model of neurophysiological phenomena proved to
conform to this logic. Because of these diverse reasons, the simplistic
model arrived at in this thesis is quite different from Grossberg's
proposal. In the next section we shall derive this model in a somewhat
logical manner. The reader may be assured that this was not the
historical progress of the model.

section 8.4 A Simplistic Model for the Neurophysiological Phenomena
in a Nervous Network Based on Embedding Field Theory

Suppose that we had two neurons, $V_1$ and $V_2$, arranged as in figure
8.2.1. Suppose further that excitement of the first neuron, $V_1$,
only results in one spike being generated per excitement. Also
suppose that we could excite the cell body of the second neuron, $V_2$,
with an input. As in embedding field theory, we are not concerned
here with how these inputs are delivered to the cell bodies. For the
sake of argument, suppose that transmitter substance is produced in
the synaptic cleft at a rate proportional to the correlation between
the membrane potential of a bouton and the membrane potential of the
adjacent cell body, $V_2$. For the purposes of this discussion, we will
assign a value of zero to the resting membrane potential. Let
$x_1(t-\tau)$ be the membrane potential at the bouton
and let $x_2(t)$ be the membrane potential of the cell body. Let
$z_{12}(t)$ be the "amount" or "concentration" of transmitter substance
present in the synaptic cleft. From our previous work we have a choice
of two formulations for $z_{12}(t)$:

8.4.1    $\dot{z}_{12}(t) = -u z_{12}(t) + v\,x_1(t-\tau)\,x_2(t)$

and the more general formulation:

8.4.2    $\dot{z}_{12}(t) = -u z_{12}(t) + v\,S(M_z + z_{12}(t))\left[a\,(x_1(t-\tau) + x_2(t))^2 - b\,(x_1(t-\tau) - x_2(t))^2\right]$

Now, we run into a problem. $x_1(t-\tau)$ and $x_2(t)$ are voltages.
$\dot{z}_{12}(t)$ is the rate of production of a chemical transmitter substance.
What are the chemical reactants which produce the transmitter substance?
How does it come about that a chemical substance is being produced at
a rate proportional to the product of voltages?

In our brief description of how membrane potentials come about,
we saw that these potentials are due to changes in the ionic
permeability of the neurons' membranes. Suppose that an ion or substance
diffuses or is released from the bouton when the membrane permeability
is changed by arrival of a spike. We will call this substance "B"
substance. Suppose further that a different ion or substance diffuses
or is released from the cell body when its membrane permeability is
changed by an EPSP or spike. We will call this substance "C" substance.
Suppose further that "B" substance and "C" substance are the reactants
which produce the transmitter substance. Since the transmitter substance
results in excitation of neuron $V_2$, we will call it "excitatory transmitter
substance", or simply "E" substance.

How do the B and C substances combine to produce E substance,
and why would the rate for this reaction be proportional to the product
of voltages? The rate of reaction for biochemical reactions may be
governed by many things, including voltages. Due to the complexities
of biochemical processes, we could blatantly assume that the rate
of reaction for the combination of B and C substances into E substance
is proportional to the product, or the squares of the sum and difference,
of two voltages. However, we need not make this blatant assumption.
It is possible to allow B and C substances to combine according to a
very simple chemical reaction and this will result in all the desired
properties for production of E substance. The remainder of this section
will be devoted to this simple chemical reaction and its implications.

Let B and C substances combine to form E substance according to
the chemical reaction:

8.4.3    $b \cdot B + c \cdot C \rightarrow 1 \cdot E$

where b is the number of moles of B and c is the number of moles
of C required to produce one mole of E.

Let this reaction occur instantly at body temperatures. That is,
if b moles of B and c moles of C are released into the synaptic cleft
at time $t_0$, then at any time $t > t_0$ only the end product of one mole
of E will be present in the cleft.

We will investigate the implications of equation 8.4.3 for the
production of E substance. The investigation will involve a number
of tricky conservation of reactants and end product equations. For
simplicity, we will make b = c = 1 in equation 8.4.3. That is:

8.4.4    $1 \cdot B + 1 \cdot C \rightarrow 1 \cdot E$

Equation 8.4.4 will be used throughout. However, it must be kept
in mind that equation 8.4.3 is the general situation and that we will
be investigating a special case.
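
The bookkeeping implied by an instantaneous reaction can be sketched
in a few lines. The function below is a hypothetical illustration, not
part of the thesis program. Given the moles of B and C present at one
instant, it returns what is left after the reaction has run to
completion, for the general stoichiometry of equation 8.4.3 and, by
default, for the special case b = c = 1.

```python
def react_instantly(n_b, n_c, b=1.0, c=1.0):
    """Run the reaction b*B + c*C -> 1*E to completion.

    n_b, n_c are the moles of B and C present before the reaction.
    Returns (moles of B left, moles of C left, moles of E produced).
    """
    # The scarcer reactant limits how many moles of E can be formed.
    n_e = max(0.0, min(n_b / b, n_c / c))
    return n_b - b * n_e, n_c - c * n_e, n_e


# Special case b = c = 1: three moles of B meet two moles of C.
print(react_instantly(3.0, 2.0))

# General case with b = 2: four moles of B meet one mole of C.
print(react_instantly(4.0, 1.0, b=2.0, c=1.0))
```

After the reaction at most one of the two reactants remains, which is
why an excess of B or of C, but never of both, can be present at once.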

Let $\hat{b}_{12}(t)$ be the number of moles of B substance released from
a bouton into the synaptic cleft per second. Let $\hat{c}_{12}(t)$ be the number
of moles of C substance released from the cell body per second into
the cleft. We can relate $\hat{b}_{12}(t)$ and $\hat{c}_{12}(t)$ to the membrane potentials,
$x_1(t-\tau)$ and $x_2(t)$.

The biochemical process which results in membrane potentials is
the selective permeability of the membranes. A positive increase in
a membrane potential is due to an increase in the membrane's
permeability to sodium ions, Na+. A decrease in membrane potential is due
to a decreased permeability to Na+ ions. As we discussed in section
8.2, the net effect of a spike or an EPSP on a neuron is a slight
increase of Na+ ions inside it and a compensating decrease of potassium,
K+. Now, suppose that B and C substances are held inside the membrane
when it is at rest potential. Suppose further that they diffuse
through the membrane with K+ ions to compensate for a net increase of
Na+ ions. Since K+ ions diffuse out of a membrane when the membrane
potential is decreasing, we can say that:

(a)    $\hat{b}_{12}(t) > 0$ when $\dot{x}_1(t-\tau) < 0$

(b)    $\hat{c}_{12}(t) > 0$ when $\dot{x}_2(t) < 0$

Since the rate of diffusion of K+ ions is proportional to the rate
of change of membrane potential, let us go a bit further and say that:

(c)    $\hat{b}_{12}(t) = [-\dot{x}_1(t-\tau)]^+$

(d)    $\hat{c}_{12}(t) = [-\dot{x}_2(t)]^+$

where:

$[y]^+ = \begin{cases} y & \text{if } y > 0 \\ 0 & \text{if } y \le 0 \end{cases}$

In other words, this says that the rate of release of B and C
into the synaptic cleft is directly proportional to the rate of decrease
of membrane potential.
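
The release rule can be sketched numerically. The fragment below is an
illustration under stated assumptions, not the program used in this
thesis: the membrane potential is taken to be available as equally
spaced samples, its rate of change is approximated by forward
differences, and the function names are hypothetical.

```python
def pos(y):
    """Bracket [y]^+: y if y > 0, otherwise 0."""
    return y if y > 0 else 0.0


def release_rates(x, dt):
    """Release rate [-dx/dt]^+ for each interval of a sampled potential.

    x is a list of membrane potential samples spaced dt seconds apart.
    The result has one entry per interval between consecutive samples.
    """
    return [pos(-(x[k + 1] - x[k]) / dt) for k in range(len(x) - 1)]


# A triangular pulse: nothing is released on the rising edge,
# and release proceeds at a constant rate on the falling edge.
pulse = [0.0, 1.0, 2.0, 1.0, 0.0]
print(release_rates(pulse, dt=1.0))
```

Only the trailing edge of the pulse contributes, in line with
statements (a) through (d) above.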

Now, what happens to the B and C substances when they are released
into the synaptic cleft? If both are being released at the same time,
then E substance will be produced. This is exactly what we want. It
says that E substance will be produced if both $x_1(t-\tau)$ and $x_2(t)$
are decreasing at the same time. Although it ignores the increasing
leading edges of $x_1(t-\tau)$ and $x_2(t)$, it does correlate the decreasing
trailing edges. Further, this process corresponds to known physical
facts. That is, when membrane potentials are decreasing, at least one
substance from inside the membrane is diffusing out of it.

However, there is a catch. Suppose a spike has excited the bouton
recently, but no EPSP or spike has excited the adjacent cell body.
Then B substance will have been released into the synaptic cleft and
there will be a net amount of it present for all time after arrival
of the spike at the bouton. Thus, if a few days later the adjacent
cell body is excited by an EPSP or a spike, E substance will be
produced. In embedding field terms, the association $V_1 \rightarrow V_2$ will be
learned. One of the key tenets of embedding field theory is that
$V_1 \rightarrow V_2$ can only be learned when $V_1$ and $V_2$ have been excited in close
temporal proximity. Thus, we can not allow excess B or C substances
to accumulate in the cleft.

There are three methods of preventing excess B or C substance from
accumulating in the cleft. It can diffuse out of the cleft, it can be
reabsorbed into the bouton or cell body from which it came, or it
can be rendered inactive by chemical reactions. There is no reason for
preferring one of these methods to another here. We will arbitrarily
choose the chemical reaction and say that B and C substances are
deactivated at a finite rate to prevent accumulation.

Let $b_{12}(t)$ be the number of moles of B in the cleft at time t.
Let $c_{12}(t)$ be the number of moles of C in the cleft at time t.
Then we will say that:

8.4.5    $\dot{b}_{12}(t) = -w_b\,b_{12}(t)$

8.4.6    $\dot{c}_{12}(t) = -w_c\,c_{12}(t)$

We now have a "correlating" process. The amount of E substance
in the cleft, $z_{12}(t)$, will grow when a spike excites the bouton in
close temporal proximity to the excitement of the adjacent cell body.
It will not grow if they are not in close temporal proximity.

We must now develop a mathematical description of the production
of E substance in the cleft as a function of the membrane potentials
$x_2(t)$ and $x_1(t-\tau)$. Thus far we have reached the following results:

8.4.7    $1 \cdot B + 1 \cdot C \rightarrow 1 \cdot E$    (instantaneous rate)

8.4.8    $\hat{b}_{12}(t) = [-\dot{x}_1(t-\tau)]^+$

8.4.9    $\hat{c}_{12}(t) = [-\dot{x}_2(t)]^+$

8.4.10   $\dot{b}_{12}(t) = -w_b\,b_{12}(t)$

8.4.11   $\dot{c}_{12}(t) = -w_c\,c_{12}(t)$

where:

$x_1(t-\tau)$ is the membrane potential of the bouton.
$x_2(t)$ is the membrane potential of the adjacent cell body.
$\hat{b}_{12}(t)$ is the number of moles of B released into the cleft from the
bouton per second.
$\hat{c}_{12}(t)$ is the number of moles of C released into the cleft from the
cell body per second.
$b_{12}(t)$ is the net number of moles of B in the cleft at time t.
$c_{12}(t)$ is the number of moles of C in the cleft at time t.

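Because of 8.4.7, either $b_{12}(t)$ or $c_{12}(t)$ is zero at any given
time. Also because of 8.4.7:

8.4.12   $\dot{z}_{12}(t) = [\min(b_{12}(t), c_{12}(t))]^+$

where:

$[\min(x, y)]^+ = \begin{cases} x & \text{if } x \le y \text{ and } x > 0 \\ y & \text{if } y \le x \text{ and } y > 0 \\ 0 & \text{if } x \le 0 \text{ or } y \le 0 \end{cases}$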

This simply says that if there is $b_{12}(t)$ of B substance in the cleft,
and we release $c_{12}(t) < b_{12}(t)$ of C substance into the cleft, then
instantaneously all of the C will be used up to produce E.
$[\min(b_{12}(t), c_{12}(t))]^+$ is
restricted to be positive because there simply can not be a negative
number of moles of B or C in the cleft.

Equation 8.4.12 will describe $z_{12}(t)$ as a function of $x_1(t-\tau)$
and $x_2(t)$ if we can develop equations relating $b_{12}(t)$ and $c_{12}(t)$ to
$x_1(t-\tau)$ and $x_2(t)$ respectively. Let us consider the conservation
of B in the cleft:

(i)   $[-\dot{x}_1(t-\tau)]^+\,dt$ of B substance is released into
      the cleft per time interval dt.

(ii)  $[\min(b_{12}(t), c_{12}(t))]^+\,dt$ of B substance is converted to
      E substance instantaneously in time interval dt.

(iii) The reaction to produce E substance is instantaneous.
      Therefore if there is $b_{12}(t_0)$ of B substance in the cleft
      and $c_{12}(t_0)$ of C is added at $t = t_0$, then at $t > t_0$ there
      can be at most $[b_{12}(t) - c_{12}(t)]^+$ of B in the cleft.
      This is the amount of B substance which will be available
      for deactivation. Therefore there is
      $w_b\,[b_{12}(t) - c_{12}(t)]^+\,dt$
      of B substance deactivated per time interval dt.

Therefore:

8.4.13   $\dot{b}_{12}(t) = [-\dot{x}_1(t-\tau)]^+ - w_b\,[b_{12}(t) - c_{12}(t)]^+ - [\min(b_{12}(t), c_{12}(t))]^+$

Similarly:

8.4.14   $\dot{c}_{12}(t) = [-\dot{x}_2(t)]^+ - w_c\,[c_{12}(t) - b_{12}(t)]^+ - [\min(b_{12}(t), c_{12}(t))]^+$

Equations 8.4.13 and 8.4.14 coupled with 8.4.12 completely describe
the process whereby the voltages $x_1(t-\tau)$ and $x_2(t)$ are converted into
the chemical substance E. As they are rather complicated, a system
diagram was drawn and is shown in figure 8.4.1.

Figure 8.4.1. System diagram for production of E substance.
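
The conservation equations for B, C, and E can also be explored
numerically. The sketch below is not the simulation program used in
this thesis. It assumes that the membrane potentials are supplied as
equally spaced samples, approximates their rates of change by forward
differences, and steps the three quantities with an explicit Euler
scheme. All function and variable names are hypothetical.

```python
def pos(y):
    """Bracket [y]^+: y if y > 0, otherwise 0."""
    return y if y > 0 else 0.0


def min_pos(x, y):
    """[min(x, y)]^+: the smaller value, or 0 if either is not positive."""
    return pos(min(x, y))


def simulate_cleft(x1_delayed, x2, dt, w_b, w_c):
    """Step the amounts of B, C, and E in the cleft through time.

    x1_delayed holds samples of the bouton potential x1(t - tau).
    x2 holds samples of the adjacent cell body potential x2(t).
    Returns a list of (b, c, z) tuples, one per time interval.
    """
    b = c = z = 0.0
    history = []
    for k in range(len(x2) - 1):
        # Release occurs only while a potential is decreasing.
        release_b = pos(-(x1_delayed[k + 1] - x1_delayed[k]) / dt)
        release_c = pos(-(x2[k + 1] - x2[k]) / dt)
        reaction = min_pos(b, c)          # conversion of B and C into E
        db = release_b - w_b * pos(b - c) - reaction
        dc = release_c - w_c * pos(c - b) - reaction
        b += dt * db
        c += dt * dc
        z += dt * reaction
        history.append((b, c, z))
    return history


falling = [1.0, 0.5, 0.0, 0.0, 0.0]   # trailing edge of a pulse
flat = [0.0] * 5                       # no excitation at all

both = simulate_cleft(falling, falling, dt=0.1, w_b=1.0, w_c=1.0)
alone = simulate_cleft(falling, flat, dt=0.1, w_b=1.0, w_c=1.0)
print(both[-1], alone[-1])
```

In this sketch E substance accumulates only when both potentials are
falling at about the same time. When the bouton is excited alone, B is
released and then decays, and no E is produced. The time step must be
small enough that dt times (w_b + 1) stays below one, otherwise the
Euler step can drive an amount negative.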




A signal from one neuron is transmitted to another by the release
of transmitter substance in the synaptic cleft. Having developed a
model for the production of transmitter substance, we must now model
how this substance is used in the transmission of signals. Let us
assume that the transmitter substance produced by our reaction is
contained in the vesicles in the synaptic cleft. Under normal
circumstances, it is safely packaged in these vesicles and unable to affect
the permeability of the adjacent cell body membrane. However, when
a spike arrives at the bouton, the vesicles suddenly burst and the
transmitter is released to attack the cell body membrane. How does
the spike cause the vesicles to burst?

Again, since we are dealing with a biochemical system, there is no
obvious method. Let us consider the events associated with the arrival
of the spike at the bouton and see if there is any reason for the
vesicles to burst. Arrival of the spike at the bouton begins with a
rapid diffusion of Na+ ions into the bouton. Here we have two possible
reasons for the vesicles to burst. Firstly, before the arrival of
the spike, the bouton and the cell body are at zero potential to one
another. When the spike begins to arrive at the bouton the potential
of the bouton rapidly increases relative to the potential of the cell
body. Thus we could conceive of the vesicles being pulled apart by
electrostatic forces. This would require dipolar vesicles. One end
of the vesicle would have to be at a different potential with respect
to the other end. If transmitter were released by this method, then it
would most likely be released before the spike peaks.

On the other hand, we could conceive of the vesicles bursting
due to the sudden infusion of Na+ into the bouton. The detailed
mechanism would require that the normal Na+ concentration in the
synaptic cleft be greater than that inside the bouton as is the case
with the interstitial fluid surrounding the neuron. Then the beginning
of the arrival of a spike at the bouton would cause the Na+ to diffuse
out of the cleft into the bouton. Since the volume of the cleft is
small compared to that of the bouton, this process would rapidly deplete
the cleft of Na+. If sodium is required to keep the vesicles together
they would come apart when a spike arrives at the bouton. Another
mechanism that would have the same result would be to surround the
vesicles with a membrane that is permeable to Na+ and H2O. Then the
sudden depletion of Na+ in the cleft would also deplete the vesicles
of Na+. The result would be an osmotically compensating insurge of
H2O into the vesicles. With sufficient Na+ depletion, enough H2O will
enter the vesicles to burst them similarly to hemolysis in red blood
cells. (Ref. 12, p. 13) Again this method would release transmitter
most likely before the spike peaks.

We could conceive of other mechanisms to cause vesicles to burst.
However, we have two likely candidates which cause them to burst before
the spike peaks. Our process for the production of transmitter begins
to operate after the spike has peaked and begun to decay.

If the process which releases transmitter operates at the same
time as the production process, we will be releasing the transmitter
that we produce. Thus, to make our system work well, we must separate
the transmitter release and production processes. For this practical
reason, and the fact that it could work, we will release transmitter
when the bouton membrane potential is increasing. That is, transmitter
will be released when: $\dot{x}_1(t-\tau) > 0$

We must now decide how much transmitter is released. For simplicity
let us assume that all the transmitter in the cleft is released when
the bouton membrane potential begins to increase. We will further
assume that all the released transmitter immediately changes the
permeability of the adjacent cell body membrane and results in an
immediate increase in the cell body's membrane potential. Note that this
implies that arrival of the spike at the bouton causes an impulsive
excitement of the adjacent cell body.

We need to decide one further thing. Release of one mole of E
substance will result in a cell membrane potential of how many volts?
We will arbitrarily say that release of one mole of E will result
in a cell body membrane potential increase of a volts.

In summary, our transmitter releasing process does the following:
Suppose that there are $z_{12}(t)$ moles of E present in the cleft. Then any
increase of the bouton membrane potential above resting potential will
cause the release of $z_{12}(t)$ moles of E. This will in turn cause an
immediate increase in the adjacent cell body membrane potential of
$a\,z_{12}(t)$ volts. The E released is used up causing the $x_2(t)$ membrane
potential to increase. Thus $z_{12}(t) = 0$ immediately after release of
the E.
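
The release rule summarized above amounts to an impulsive update, and
can be sketched as follows. This fragment is an illustration only; the
function name and the argument names are hypothetical, and the volts
per mole factor is passed in as `gain`.

```python
def release_transmitter(z, x2, dx1_delayed, gain):
    """Apply the transmitter release rule at one instant.

    z           is the moles of E currently in the cleft.
    x2          is the membrane potential of the adjacent cell body.
    dx1_delayed is the rate of change of the bouton potential x1(t - tau).
    gain        is the potential increase in volts per mole of E released.
    Returns the updated pair (z, x2).
    """
    if dx1_delayed > 0 and z > 0:
        # All of the E is released at once and is used up in the process.
        return 0.0, x2 + gain * z
    # Bouton potential is flat or falling: nothing is released.
    return z, x2


print(release_transmitter(0.5, 1.0, dx1_delayed=2.0, gain=4.0))
print(release_transmitter(0.5, 1.0, dx1_delayed=-2.0, gain=4.0))
```

Because release is tied to a rising bouton potential and production to
a falling one, the two processes never operate on the same instant in
this sketch.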

In order to use this simplistic model for the production and release
of transmitter in the synaptic cleft, we must also model the membrane
potential responses of cell bodies and axons. At the beginning of the
modeling process, we said that we were only interested in the propagation
of a single spike across the synaptic cleft. Our model for the membrane
response at other parts of the system thus need only account for a single
spike. Rather than going through the laborious process of finding
processes which will exactly duplicate the membrane potential traces
shown in figure 8.2.1, we will adopt the formulation for x processes
at a node in an embedding field. Further, we will not consider thresholds
in this study.

With these assumptions, suppositions, and modeling results, we
are in a position to write down a complete set of equations governing
this simplistic model for a nervous network. We will summarise the
notations used and then write down the equations.

The equations and notations will be presented in a generalised
form. Since this is just a reformulation of the embedding field network
equations, we will number the cell bodies in a nervous network
and refer to them as the "$V_i$" cell body. All synapses between boutons
connected to the $V_j$ cell body by axons and the $V_i$ cell body will be
referred to by the dual subscript ji. The first, j, subscript shows
the direction a signal is coming from and the second subscript shows
the direction it is traveling toward across the synapse.

Chemistry: 

B substance is a chemical substance released from a bouton into
a synaptic cleft when the bouton's membrane potential is decreasing.

C substance is a chemical substance different from B substance
which is released from the cell body into synaptic clefts when the cell
body membrane potential is decreasing.

E substance is excitory transmitter substance. It is produced
by the instantaneous reaction:

$1 \cdot B + 1 \cdot C \rightarrow 1 \cdot E$

At all times when the bouton membrane potential is at resting potential
or decreasing, the E substance is stored in the synaptic cleft and is
unable to affect membrane potentials. When the bouton membrane potential
is increasing, all the E substance in the cleft is immediately released.
When it is released it immediately causes an increase in the adjacent
cell body membrane potential of a volts per mole of E substance released.
The E substance released is used up causing the cell body membrane
potential to increase.

Variables: 

$P_i(t)$ is an input signal delivered directly to the $V_i$ cell body
from the environment,

$x_i(t)$ is the cell body membrane potential of the $V_i$ cell body
in the nervous network,

$x_j(t - \tau)$ is the membrane potential of the boutons connected to
the $V_j$ cell body by axons,

$z_{ji}(t)$ is the number of moles of E substance present in the synaptic
cleft between boutons connected to the $V_j$ cell body by axons and the
$V_i$ cell body,

$b_{ji}(t)$ is the net amount of B substance in moles in the ji synaptic
cleft at time t,

$c_{ji}(t)$ is the net amount of C substance in moles in the ji synaptic
cleft at time t.

Constants : 

$\alpha$ is the decay rate for membrane potentials,
$w_b$ is the deactivation rate for B substance in a synaptic cleft,
$w_c$ is the deactivation rate for C substance in a synaptic cleft,
a is the released transmitter effectiveness factor on a cell
body membrane potential. One mole of E substance released in a synaptic
cleft results in an increase in the adjacent cell body's membrane
potential of a volts.

$\tau$ is the interval between origination of a spike at a cell body
and its arrival at the boutons attached to that cell body by axons.



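The equations governing the system's performance:

8.4.15    $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + a \sum_j R(\dot{x}_j(t - \tau))\, z_{ji}(t)$

where $R(\dot{x}_j(t - \tau))\, z_{ji}(t)$ is a special function defined by:

$$R(\dot{x}_j(t - \tau))\, z_{ji}(t) = \begin{cases} 0 & \text{if } \dot{x}_j(t - \tau) \le 0 \\ \text{an impulse of amplitude } z_{ji}(t) & \text{when } \dot{x}_j(t - \tau) > 0 \end{cases}$$

8.4.16    $\dot{z}_{ji}(t) = [\min(b_{ji}(t), c_{ji}(t))]^+ - R(\dot{x}_j(t - \tau))\, z_{ji}(t)$

where:

$$[\min(x, y)]^+ = \begin{cases} x & \text{if } x \le y \text{ and } x > 0 \\ y & \text{if } y \le x \text{ and } y > 0 \\ 0 & \text{if } x < 0 \text{ or } y < 0 \end{cases}$$

8.4.17    $\dot{b}_{ji}(t) = [-\dot{x}_j(t - \tau)]^+ - w_b [b_{ji}(t) - c_{ji}(t)]^+ - [\min(b_{ji}(t), c_{ji}(t))]^+$

where:

$$[y]^+ = \begin{cases} y & \text{if } y > 0 \\ 0 & \text{if } y \le 0 \end{cases}$$

8.4.18    $\dot{c}_{ji}(t) = [-\dot{x}_i(t)]^+ - w_c [c_{ji}(t) - b_{ji}(t)]^+ - [\min(b_{ji}(t), c_{ji}(t))]^+$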
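The two bracket operators used in equations 8.4.16 through 8.4.18 can be sketched in a few lines of Python. This is only an illustration; the function names `pos` and `min_pos` are hypothetical, and returning zero when an argument is exactly zero is an assumption, since the definitions above do not state that case explicitly.

```python
def pos(y):
    """[y]^+ : y if y > 0, otherwise 0."""
    return y if y > 0 else 0.0


def min_pos(x, y):
    """[min(x, y)]^+ : the smaller argument if it is positive, otherwise 0.

    Assumption: an argument that is exactly zero gives 0.
    """
    return pos(min(x, y))


if __name__ == "__main__":
    print(min_pos(2.0, 3.0))   # smaller positive argument
    print(min_pos(-1.0, 3.0))  # a negative argument gives zero
    print(pos(-0.5))           # negative values are clipped to zero
```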
section 8.5 Experiments with the Simplistic Neurophysiological Model

Equations 8.4.15 through 8.4.18 look formidable. They were
simulated on a digital computer and it was experimentally verified
that they work. A simple network consisting of one neuron, $V_1$,
synapsing on another, $V_2$, was used. Since the method for the excitement of
a cell body by transmitter substance is an impulse, the external
inputs $P_1(t)$ and $P_2(t)$ were specified to be impulses of amplitude
10. The remaining parameters were selected arbitrarily to be:

$\alpha$ = 3.3333 sec.$^{-1}$

$\tau$ = 0.3 sec.

$w_b = \infty$

$w_c$ = 0.5

a = 1.0

Figure 8.5.1 shows the results. The impulse input $P_2(t)$ was
presented to cell body $V_2$ exactly at the instant that the first spike
$x_1(t - \tau)$ arrived at the 1,2 synapse. Thus the signals $x_1(t - \tau)$ and
$x_2(t)$ exactly correlated. Therefore the amount of B substance entering
the cleft per second was exactly equal to the amount of C substance.
Thus all the B and C substance was used up instantly to produce E as
is shown by the zero $b_{12}(t)$ and $c_{12}(t)$ traces. The amount of E produced
was exactly enough to cause $x_1(t - \tau)$ to exactly correlate with $x_2(t)$
on the second response. Again all B and C was used up producing E
and the amount of E produced was the same as before.

Since all the B and C substance was used up instantly to produce
E, we can analytically compute the traces in figure 8.5.1. The response
of $V_1$ to an impulse is:

$x_1(t) = 10 e^{-\alpha(t - 0.1)}$    for $t \ge 0.1$ sec.

Allowing for the transmission delay between cell body $V_1$ and the bouton,
the bouton membrane potential is:

$x_1(t - \tau) = 10 e^{-\alpha(t - 0.4)}$    for $t \ge 0.4$ sec.

The response of $V_2$ to the impulse $P_2(t)$ is:

$x_2(t) = 10 e^{-\alpha(t - 0.4)}$    for $t \ge 0.4$ sec.

The amount of B entering the synaptic cleft is:

$\dot{b}_{12}(t) = [-\dot{x}_1(t - \tau)]^+ = 10 \alpha e^{-\alpha(t - 0.4)}$    for $t \ge 0.4$ sec.

Similarly, the amount of C entering the cleft is:

$\dot{c}_{12}(t) = [-\dot{x}_2(t)]^+ = 10 \alpha e^{-\alpha(t - 0.4)}$    for $t \ge 0.4$ sec.

Since all the B and C entering the cleft is instantly used up
producing E,

$\dot{z}_{12}(t) = 10 \alpha e^{-\alpha(t - 0.4)}$

or:

$z_{12}(t) = \int_{0.4}^{t} 10 \alpha e^{-\alpha(t' - 0.4)}\, dt' = 10 (1 - e^{-\alpha(t - 0.4)})$

which is exactly what figure 8.5.1 shows.
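The first response can be checked numerically with a small time-stepping sketch. This is not the original simulation program; it is a minimal Python illustration that assumes a fixed step `dt`, exact exponential decay per step, and the rule that B left over after reacting with C is discarded at once (the infinite deactivation rate for B). The function name `simulate_z` is hypothetical.

```python
import math

ALPHA, TAU, A_EFF, W_C, AMP = 3.3333, 0.3, 1.0, 0.5, 10.0  # values from the text


def simulate_z(t_end, t_v1=0.1, t_v2=0.4, dt=1e-4):
    """Return z_12 at time t_end for one V1 -> V2 synapse with w_b infinite."""
    decay = math.exp(-ALPHA * dt)
    k_bouton = round((t_v1 + TAU) / dt)  # step at which the spike reaches the bouton
    k_v2 = round(t_v2 / dt)              # step at which P_2 excites V2
    xb = x2 = c = z = 0.0                # bouton potential, V2 potential, C, E
    for k in range(1, round(t_end / dt) + 1):
        xb_old, x2_old = xb, x2
        xb *= decay
        x2 *= decay
        if k == k_bouton:                # rising edge: release all stored E
            xb += AMP
            x2 += A_EFF * z
            z = 0.0
        if k == k_v2:
            x2 += AMP
        b_in = max(xb_old - xb, 0.0)     # B entering during this step
        c += max(x2_old - x2, 0.0)       # C entering during this step
        made = min(b_in, c)              # 1 B + 1 C -> 1 E, instantaneous
        z += made
        c -= made
        c -= W_C * c * dt                # leftover C deactivates; leftover B is discarded
    return z


if __name__ == "__main__":
    t = 1.0
    print(simulate_z(t), AMP * (1.0 - math.exp(-ALPHA * (t - 0.4))))
```

With both cells excited in step (spike arrival and $P_2$ at t = 0.4), the stepped value of $z_{12}$ should track the closed form $10(1 - e^{-\alpha(t - 0.4)})$ to within rounding error.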

The second response is due to release of E by the sudden
increasing leading edge of $x_1(t - \tau)$. This results in the instantaneous
release of all E in the cleft as is shown by the $z_{12}$ trace. From equation
8.4.15, the release of the E results in an instantaneous increase in the
amplitude of $x_2(t)$ to a value of $a z_{12}$. As a = 1.0 and $z_{12} \approx 10.0$,
$x_2(t)$ suddenly jumps to a value of 10.0 as shown. When the sharp
increasing leading edge of $x_1(t - \tau)$ is over, no more transmitter is
released and the production process begins to produce E substance.
The amplitudes are the same as in the first response and the same
amount of E is produced again. As long as the amplitude of the impulse
exciting $V_1$ is kept at a value of 10, the same traces will be produced
for as many excitements of $V_1$ as we desire. We will analytically
prove this statement shortly.

If we consider the traces $x_1(t)$, $x_1(t - \tau)$, and $x_2(t)$ to be spikes,
then the assumption that the input impulses' amplitudes will remain
constant is realistic. Spikes are always of the same amplitude and
duration in a particular species of neurons. Note that once the
transmitter substance was formed, arrival of the spike $x_1(t - \tau)$ at
the bouton had the same effect as an input impulse on $x_2(t)$. Thus we
may consider that our input impulses, $P_1(t)$ and $P_2(t)$, are the effects
of spikes arriving at boutons synapsing on $V_1$ and $V_2$ which already
have 10 molar units of E substance present in their synaptic clefts.

Figure 8.5.1 shows the result of the special case of a spike
arriving at the 1,2 synapse at exactly the same time that $V_2$ is excited
by an input impulse. To check the ability of these networks to learn
when the input impulse to a cell body is delivered at a time different
from the instant that a spike arrives at the synapse, another experiment
was performed. One cell, $V_c$, was arranged so that it synapses on
5 other cell bodies in an outstar arrangement. The parameters in the
network were kept the same as in the previous experiment. Figure 8.5.2
shows the arrangement of the neurons and the results.

The amount of B substance in the clefts, $b_{ci}(t)$, was zero at all
times. This is because the deactivation rate for B, $w_b$, was infinite.
In the simulation, it was considered that the amount of B entering
the cleft in an infinitesimal time interval, dt, was made available
to react with any C present to form E. If there was any B left over
after this reaction, it was immediately deactivated before any more
B entered the cleft in the next infinitesimal time interval.

Figure 8.5.2. A more complex experiment with the simplistic
neurophysiological model.

Nevertheless, figure 8.5.2 is a good look at the processes going
on in this model. The $c_{ci}(t)$ and $z_{ci}(t)$ traces show the instantaneousness
of the E production reaction. $V_1$ was excited by an input impulse
before the spike from $V_c$ arrived at the boutons. Thus C was released
into the c,1 synaptic cleft and began to be deactivated. When the
spike arrived at the c,1 bouton, B was released into the cleft. Since
there was more B being released into the cleft than there was C present
in the cleft, all the C was instantly used up producing E. Thus
$c_{c1}(t)$ suddenly drops to zero when the spike arrives at the c,1
bouton at t = 0.4. However, enough of the C released by $V_1$ had already
been deactivated when the spike arrived to allow $z_{c1}(t)$ to rise to
a value of only 5.

The traces associated with $V_2$ are exactly the same as those associated
with $V_2$ in the previous experiment. The spike arrived at the
c,2 bouton at exactly the same instant that the $P_2(t)$ input impulse
was delivered to $V_2$. Thus all the C and B released was used up producing
E.

The traces associated with $V_3$, $V_4$, and $V_5$ show what happened when
the input impulses are delivered to the cell bodies after arrival of
the spike at the boutons. Because of the infinite deactivation rate
for B, there was no accumulation of B in the cleft. Thus only the amount
of B entering the cleft when these cell bodies were excited is available
for reaction with C to form E. Remember that the B entering the
cleft is:

$\dot{b}_{ci}(t) = [-\dot{x}_c(t - \tau)]^+$

and the C entering the cleft is:

$\dot{c}_{ci}(t) = [-\dot{x}_i(t)]^+$

$V_3$ was excited by the input impulse $P_3(t)$ at time t = 0.5. The spike
arrived at the bouton at t = 0.4. Because of the infinite deactivation
rate for B, all the B which entered the cleft before t = 0.5 was
deactivated instantly. Thus the B available for reaction with the C
which begins to enter the cleft after t = 0.5 is:

$\dot{b}_{c3}(t) = [-\dot{x}_c(t - \tau)]^+ = 10 \alpha e^{-0.1\alpha} e^{-\alpha(t - 0.5)}$    for $t \ge 0.5$ sec.

The C which enters the cleft after t = 0.5 is:

$\dot{c}_{c3}(t) = [-\dot{x}_3(t)]^+ = 10 \alpha e^{-\alpha(t - 0.5)}$    for $t \ge 0.5$ sec.

Thus, the amount of C entering the cleft is a factor of $e^{0.1\alpha}$
greater than the B entering the cleft. The reaction $1 \cdot B + 1 \cdot C \rightarrow 1 \cdot E$
is instantaneous and the coefficients of unity mean that
$[\min(b_{c3}(t), c_{c3}(t))]^+$ of B is converted to E immediately upon entering the
cleft. Since $\dot{b}_{c3}(t)$ is less than $\dot{c}_{c3}(t)$, all of the B is converted
to E. Knowing this, we can analytically compute the amount of E
produced:

$\dot{z}_{c3}(t) = [\min(b_{c3}(t), c_{c3}(t))]^+ = \dot{b}_{c3}(t)$

This last conclusion is a technical point. Since all the B entering
the cleft is immediately used up, there can be no accumulation of B
and $b_{c3}(t)$ is technically zero. However, in an infinitesimal time
interval, dt, $\dot{b}_{c3}(t)\,dt$ of B did enter the cleft. We must hypothesize
an infinitesimal accumulation of B in the cleft of:

$db_{c3}(t) = \dot{b}_{c3}(t)\, dt$

Since $db_{c3}(t) < c_{c3}(t)$ at all times, the amount of E produced during
the time interval dt is:

$dz_{c3}(t) = db_{c3}(t) = \dot{b}_{c3}(t)\, dt$

Thus $\dot{z}_{c3}(t) = \dot{b}_{c3}(t)$.

The E produced at any time t > 0.5 is:

$z_{c3}(t) = \int_{0.5}^{t} 10 \alpha e^{-0.1\alpha} e^{-\alpha(t' - 0.5)}\, dt' = 10 e^{-0.1\alpha} (1 - e^{-\alpha(t - 0.5)})$

For times sufficiently greater than t = 0.5, the E produced is:

$z_{c3}(t \gg 0.5) = 10 e^{-0.1\alpha}$

For $\alpha$ = 3.3333, this gives us:

$z_{c3}(t \gg 0.5) = 7.2$

which agrees very well with the experimental results shown on the
$z_{c3}(t)$ traces in figure 8.5.2.
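The limiting value can be reproduced directly from the closed form. The following Python sketch evaluates $10 e^{-\alpha \cdot \text{delay}}$ for a cell body excited some delay after the spike reaches the bouton. The delay of 0.1 sec is the $V_3$ case from the text; the other delays printed are hypothetical values chosen only to show the trend, and the function names are illustrative.

```python
import math

ALPHA = 3.3333  # decay rate from the text (1/sec)
AMP = 10.0      # impulse amplitude from the text


def z_late(t, delay, t_spike=0.4):
    """E produced by time t when the cell body is excited `delay` sec after the spike arrives."""
    t_exc = t_spike + delay
    if t < t_exc:
        return 0.0
    return AMP * math.exp(-ALPHA * delay) * (1.0 - math.exp(-ALPHA * (t - t_exc)))


def z_limit(delay):
    """Value approached by z_late for t much greater than the excitation time."""
    return AMP * math.exp(-ALPHA * delay)


if __name__ == "__main__":
    for delay in (0.1, 0.2, 0.3):  # 0.1 is the V_3 case; 0.2 and 0.3 are hypothetical
        print(delay, round(z_limit(delay), 2))
```

The later the cell body is excited relative to the spike, the less B remains to be matched with C, so less E is stored.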

Since there was more C than B entering the c,3 cleft, and since
C was deactivated at a finite rate, there is an accumulation of C in
the cleft. The $c_{c3}(t)$ trace shows this accumulation and its deactivation.

The traces associated with $V_4$ and $V_5$ are similar to those associated
with $V_3$. The only difference is that $V_4$ and $V_5$ were excited by input
impulses at progressively later times than $V_3$.

The second response shown on all the traces is a "prediction"
response. The command cell body, $V_c$, was excited by an input impulse
alone. The spike so generated traveled down the axons to the "grid"
cell bodies, $V_1$ through $V_5$. When it arrived, it instantly released all
the transmitter E substance in the synaptic clefts. Each of the "grid"
cell bodies was excited to a membrane potential of $z_{ci}(t = 2.2)$. In
this case, there was no time difference between the arrival of the
spike at the boutons and the excitement of the grid cell bodies.
Both events occurred at t = 2.2. Thus the amount of B being released
into the clefts which could react with C to form E was:

$\dot{b}_{ci}(t) = [-\dot{x}_c(t - \tau)]^+ = 10 \alpha e^{-\alpha(t - 2.2)}$    for $t \ge 2.2$

However, the amount of C being released into the clefts at the same
time was:

$\dot{c}_{ci}(t) = [-\dot{x}_i(t)]^+ = a z_{ci}(t = 2.2)\, \alpha e^{-\alpha(t - 2.2)}$    for $t \ge 2.2$

In all cases, the amount of C being released was less than or equal to
the amount of B being released. Thus:

$\dot{z}_{ci}(t) = a z_{ci}(t = 2.2)\, \alpha e^{-\alpha(t - 2.2)}$    for $t \ge 2.2$

or:

$z_{ci}(t) = \int_{2.2}^{t} a z_{ci}(t = 2.2)\, \alpha e^{-\alpha(t' - 2.2)}\, dt' = a z_{ci}(t = 2.2)(1 - e^{-\alpha(t - 2.2)})$

for t sufficiently greater than t = 2.2 (with a = 1.0):

$z_{ci}(t \gg 2.2) = a z_{ci}(t = 2.2) = z_{ci}(t = 2.2)$

which is what the $z_{ci}(t)$ traces in figure 8.5.2 show. Note that the
effect of a prediction excitement of the grid cell bodies is to produce
the exact amount of E after excitement as there was before the excitement.
In this sense, the network is self-sustaining. We can continue to excite
the grid cell bodies with prediction spikes for as long as we want.
The result will be the same as the prediction response shown.

Because the amount of B being released was always greater than or
equal to the amount of C being released during the prediction excitement,
there is no accumulation of C in the clefts. The $c_{ci}(t)$ traces are
therefore zero during the prediction excitement.
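The self-sustaining property can be illustrated with a simple recurrence. The Python sketch below assumes that each prediction excitement is followed by an interval long compared with $1/\alpha$, so that production runs to completion, and that any residual membrane potential is negligible. The text only treats the case where C is the limiting reactant; the clamp at the impulse amplitude for the case $a z > 10$ (B limiting) is an assumption added here, as are the function names.

```python
def next_z(z, a=1.0, amp=10.0):
    """E stored after one fully recovered prediction excitement.

    C released is a*z in total and B released is amp in total; the smaller
    amount limits the E produced (the clamp at amp is an assumption).
    """
    return min(a * z, amp)


def iterate(z0, n, a=1.0, amp=10.0):
    """Stored E before the first excitement and after each of n excitements."""
    zs = [z0]
    for _ in range(n):
        zs.append(next_z(zs[-1], a, amp))
    return zs


if __name__ == "__main__":
    print(iterate(7.2, 3))         # a = 1.0: the stored amount is unchanged
    print(iterate(7.2, 3, a=0.5))  # a < 1: the stored amount shrinks each time
```

Under these assumptions, only a = 1.0 leaves the learned amount of E unchanged from one prediction excitement to the next.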
section 8.6



Inhibition and an X ^ Logic 

We now have a simplistic model of a nervous system that is a
synthesis of some neurophysiological facts, some assumptions, and
embedding field theory. Although much thought went into the modeling
process, we can not pretend the model is accurate. The fact that the
model does work is a powerful argument for a deeper study of the
embedding field theoretical assumptions concerning learning at the
microscopic level in living organisms.

The time was not available for that deeper study. Shortly,
we will drop the neurophysiological names that have been attached
to the elements and processes in this model and consider it to be
an embedding field network only. Before we do so, there is one further
neurophysiological phenomenon which occurs in nervous systems that we
must consider: inhibition. At the
microscopic level, inhibition consists of depressing the cell body
membrane potential below the resting potential. Figure 8.6.1 shows
a common inhibiting arrangement in the spinal column of vertebrates.

The two large light neurons are motoneurons. The dark neuron is a
Renshaw cell. The sequence of events shown on the traces is as follows:
The cell body of motoneuron $V_1$ is excited by a spike. Its membrane
potential rises with an EPSP and a reflected spike. This spike is
propagated down $V_1$'s axon. A collateral breaks off of this axon and
synapses on the Renshaw cell's body. Arrival of the spike at this
synapse excites the Renshaw cell body which fires a burst of spikes.
These spikes propagate up the Renshaw cell's axon. The Renshaw cell's
axon breaks up into two collaterals. One synapses on the $V_1$ cell body
and another synapses on the $V_2$ cell body. When the burst of spikes
arrives at these synapses, inhibitory transmitter is released. The
inhibitory transmitter causes a decrease in the membrane potentials of
$V_1$ and $V_2$ below resting potential. The membrane potential traces which
are below resting potential are called inhibitory post synaptic potentials
or IPSP's.

Figure 8.6.1. A common inhibitory arrangement in the spinal column of
vertebrates. The dark neuron is a Renshaw cell. The light neurons are
motoneurons.

The important things we want to note from figure 8.6.1 are:

(a) The Renshaw cell's body membrane potential increases in the
positive direction when it is excited.

(b) The spikes propagated along the Renshaw cell's axon are
similar to the spikes along the motoneurons' axons. In particular
they are increases in the positive direction of the axon's membrane
potential.

(c) A transmitter substance is released by these spikes. It
causes a decrease in the motoneuron's cell body membrane potential.
This decrease in membrane potential does not cause any change in the
motoneuron's axon membrane potentials.

These facts show that there is no negative membrane potential
propagated anywhere in the system. All propagating signals are positive
signals. In the discussion of allowable prediction signal states in
section 7.2, we did not allow the propagation of negative amplitude
prediction signals. We made this restriction on the grounds of
consistency and the fact that negative amplitude prediction signals
were not needed in an outstar. In the nervous system of living
organisms, negative amplitude "prediction" signals do not occur. Thus
our restriction on the allowable states of prediction signals in embedding
field networks is consistent with neurophysiological data.
The inhibitory transmitter substance released by the Renshaw
cell's burst of spikes is considered to be a chemical substance that
is different from the excitory transmitter which excites the
motoneuron's cell bodies. There are at least three chemical substances
which act as transmitter in nervous systems. They are acetylcholine,
epinephrine, and norepinephrine. In one part of the body, and with one
species of neuron, one of the substances may act as an excitory
transmitter and another may act as an inhibitory transmitter. In
another part of the body and with another species of neuron, their
effects may reverse.

With these few facts in mind, we will now invent a simplistic
model for inhibition which we shall add to our previous model. Firstly,
we will postulate an inhibitory transmitter substance H which is
different from our excitory transmitter substance E. Since the H and
E may reverse their roles in other parts of the nervous system, we
want the processes for production and release of H to be similar to
those for E. Therefore we will assume that H substance is stored in
the synaptic cleft. It is released when the adjacent bouton membrane
potential is increasing, i.e., when $\dot{x}(t - \tau) > 0$. We will further
assume that the release of one mole of H will result in an instantaneous
increase in the adjacent cell body membrane potential of $\gamma$ volts.
Note that this is an increase of $\gamma$ volts. We have specified that the
release of one mole of E will result in an instantaneous cell body
membrane potential increase of a volts. By specifying a or $\gamma$ positive
or negative, we can specify their effects in various parts of our
system. However, normally $\gamma$ will be assigned a negative value.
We must now invent a process which will produce H substance from
chemical substances available in the synaptic cleft. To do this we
will look closely at the Renshaw cell bouton-motoneuron synapse in
figure 8.6.1. The effect of H substance is a decrease in the
motoneuron's cell body membrane potential. This is caused by an increase
in the cell body membrane's permeability to K+ and Cl- ions. With
the sodium pump working to eject Na+, the net effect is an increase
of K+ ions inside the cell body. Remember that we allowed C substance
to be released when the cell body membrane potential was above
resting potential, but decreasing. When the cell body membrane potential
is above resting potential but decreasing, K+ ions are diffusing
out of the cell body. Thus we have sort of tied the release of C
substance to the diffusion of K+ ions out of the cell body. Now,
when the cell body's membrane potential is decreasing below resting
potential, K+ ions are diffusing into the cell body. Thus we may
assume that no C substance is being released into the synaptic cleft
when the cell body membrane potential is decreasing below resting potential.
We will make the further assumption that no C substance is released
at any time when the cell body's membrane potential is below resting
potential. Thus C substance can not be involved in the production of
H substance. We could postulate another chemical substance which
is released from the cell body into the synaptic cleft when the cell
body membrane potential is below resting potential. This is a valid
option, but we will not investigate it further.

Since the Renshaw cells' spikes are the same as all other spikes,
B substance is being released from the Renshaw cells' boutons. Thus
B substance could be a reactant in the production of H. Suppose that
there is a substance, S, which is always present in large quantities
in the synaptic cleft. Suppose further that S substance reacts with
B substance according to:

8.6.1    $1 \cdot B + 1 \cdot S \rightarrow 1 \cdot H$

Suppose further that this reaction is fast, but not as fast as
the reaction producing E substance. Then excitation of a bouton with
a spike will release B substance. If there is C substance present in
the cleft, then $[\min(b(t), c(t))]^+$ of E substance will be produced.
If there is any B left over after this reaction, it will combine
with S to form H. In the experiments of section 8.5 we saw that an
accumulation of B in the cleft is not necessary for learning. (The
accumulation of B in those experiments was always zero because the
deactivation rate for B, $w_b$, was infinite.) Further, if we make this
postulation, then the logic governing the performance of the elements
in the network will be the logic tabulated in table 8.6.1.

This logic conforms to the following tabulation:

Table 8.6.1

    x_c     x_i     result

     0       0        0
     0      +1        0
    +1       0       -1
    +1      +1       +1
    +1      -1       -1
     0      -1        0

In the current context, this tabulation means that there is no
transmitter substance, E or H, produced when the bouton membrane potential
is at resting potential, or $x_c = 0$. This is independent of whatever
the adjacent cell body membrane potential may be. However, when the
bouton membrane potential is above resting potential, there are three
cases: When the adjacent cell body membrane potential is at or below
resting potential, inhibitory H transmitter substance is formed. If
the adjacent cell body membrane potential is above resting potential,
but decreasing, then excitory E substance is formed.

Thus the reaction $1 \cdot B + 1 \cdot S \rightarrow 1 \cdot H$ accomplishes one of the stated
aims of this chapter, the implementation of the logic of table 8.6.1. We will
therefore adopt it as the chemical reaction producing H substance
in the model.
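The logic just described can be written as a small lookup. This Python sketch is only an illustration of the tabulated cases; the function name is hypothetical, the states are coded as -1, 0, and +1 for below, at, and above resting potential, and the return value is read as +1 for E produced, -1 for H produced, and 0 for no transmitter.

```python
def transmitter_logic(x_c, x_i):
    """Transmitter produced for bouton state x_c and cell body state x_i.

    Returns +1 (excitory E), -1 (inhibitory H), or 0 (none).
    """
    if x_c <= 0:
        return 0              # bouton at rest: nothing is produced
    return 1 if x_i > 0 else -1


if __name__ == "__main__":
    for x_c in (0, 1):
        for x_i in (-1, 0, 1):
            print(x_c, x_i, transmitter_logic(x_c, x_i))
```

Prediction signals are never negative in this model, so only the bouton states 0 and +1 are listed.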

The alert reader may have noticed that we have already accomplished
the third aim of this chapter. We have already invented a process which
does not cause "pulse lengthening" of a signal being transmitted through
a neuron or embedding field node cascade. Consider a cascade of N
neurons, $V_j$, j = 1, 2, ..., N. The $V_{j-1}$ neuron synapses on the $V_j$ neuron. The
$V_1$ neuron is the "starting" neuron. Suppose that each of the j-1,j
synaptic clefts in the cascade contains A moles of E substance. Let
the E effectiveness factor, a, be a = 1.0. For simplicity let the H
effectiveness factor, $\gamma$, be $\gamma$ = 0, so that we do not have to worry
about inhibition. Let the "starting" neuron, $V_1$, be excited by an input
impulse of amplitude A at time $t_1$. Then:

$$x_1(t) = \begin{cases} 0 & \text{for } t < t_1 \\ A e^{-\alpha(t - t_1)} & \text{for } t \ge t_1 \end{cases}$$

This signal will arrive at the 1,2 synapse at $t_1 + \tau$. It will cause
the release of all the E substance present. Thus:

$$x_2(t) = \begin{cases} 0 & \text{for } t < t_1 + \tau \\ A e^{-\alpha(t - t_1 - \tau)} & \text{for } t \ge t_1 + \tau \end{cases}$$

Since $x_1(t - \tau)$ and $x_2(t)$ are identical, A moles of E will be
produced in the 1,2 cleft by the E production process after $t = t_1 + \tau$.
The same argument holds for each pair of neurons, $V_{j-1}$ and $V_j$, in the
cascade. Thus:

$$x_j(t) = \begin{cases} 0 & \text{for } t < t_1 + (j-1)\tau \\ A e^{-\alpha(t - t_1 - (j-1)\tau)} & \text{for } t \ge t_1 + (j-1)\tau \end{cases}$$

Except for the time delay, $(j-1)\tau$, the signal is transmitted through
the cascade unchanged. There is no "pulse lengthening". Additionally,
the self-sustaining property of the E production insures that we can
propagate any number of signals through the cascade without distortion.
(Note, this last statement is true only if there is a time interval
between consecutive signals which is large enough to allow the E
production process to produce approximately A moles of E before the next
signal is started at the "starting" neuron. In practice, making this
interval $3/\alpha$ seconds is sufficient.)

The reason that such a cascade does not distort a signal is simple.
The input signal to the "starting" neuron is an impulse. The "prediction"
input signal to all the cell bodies in the cascade is also an impulse.
This is because the effect of the release of A moles of E in a synaptic
cleft is an instantaneous increase of the adjacent cell body membrane
potential of A volts. The effect of an input impulse is an instantaneous
increase of the cell body membrane potential of A volts. Thus the
effect of an input impulse and a "prediction" excitement are the same.
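The cascade argument can be checked with a short calculation. The Python sketch below evaluates the signal at the j-th neuron as a delayed copy of the first one, and the fraction of A moles of E restored after a given interval. The default values (A = 10, $\alpha$ = 3.3333, $\tau$ = 0.3, starting time 0) are borrowed from the earlier experiments as an assumption, and the function names are illustrative.

```python
import math


def cascade_signal(j, t, amp=10.0, alpha=3.3333, tau=0.3, t1=0.0):
    """x_j(t) for the j-th neuron (j = 1 is the starting neuron)."""
    onset = t1 + (j - 1) * tau
    if t < onset:
        return 0.0
    return amp * math.exp(-alpha * (t - onset))


def recovered_fraction(interval, alpha=3.3333):
    """Fraction of A moles of E produced again `interval` sec after a release."""
    return 1.0 - math.exp(-alpha * interval)


if __name__ == "__main__":
    # Same shape at neuron 1 and neuron 4, only shifted by 3 * tau.
    print(cascade_signal(1, 0.2), cascade_signal(4, 0.2 + 3 * 0.3))
    # Waiting 3/alpha seconds restores about 95 percent of the E.
    print(recovered_fraction(3.0 / 3.3333))
```

The peak amplitude and decay shape are the same at every stage, and waiting $3/\alpha$ seconds between signals restores $1 - e^{-3}$ of the stored E.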

Having modeled an arbitrary mechanism, we will now drop the
neurophysiological names assigned to the elements and processes of the
nodes and replace them with embedding field names. To do so, we must
add a "synaptic cleft" between the arrowheads of the embedding field
theory and the adjacent node. This is added to give us a definite
place for the chemical reactions we have invented to occur. We will
denote the synaptic cleft between the $N_{ji}$ arrowhead and the $V_i$ node
by $S_{ji}$. Because our model works according to chemical reactions, we
will call a network composed of elements from this model a chemical
embedding field network. We list here a complete description of the
processes. There are several new variables in the following equations.
They are defined after the equations.

Equations for the chemical embedding field network processes:

8.6.2    $\dot{x}_i(t) = -\alpha x_i(t) + P_i(t) + a \sum_j R(\dot{p}_j(t - \tau))\, z_{ji}(t) + \gamma \sum_j R(\dot{p}_j(t - \tau))\, h_{ji}(t)$

where $R(\dot{p}_j(t - \tau))\, y(t)$ is a special function defined by:

$$R(\dot{p}_j(t - \tau))\, y(t) = \begin{cases} 0 & \text{if } \dot{p}_j(t - \tau) \le 0 \\ \text{an impulse of amplitude } y(t) & \text{when } \dot{p}_j(t - \tau) > 0 \end{cases}$$

8.6.3    $p_j(t - \tau) = [x_j(t - \tau)]^+$

8.6.4    $\dot{z}_{ji}(t) = [\min(b_{ji}(t), c_{ji}(t))]^+ - R(\dot{p}_j(t - \tau))\, z_{ji}(t)$

where:

$$[\min(x, y)]^+ = \begin{cases} x & \text{if } x \le y \text{ and } x > 0 \\ y & \text{if } y \le x \text{ and } y > 0 \\ 0 & \text{if } x < 0 \text{ or } y < 0 \end{cases}$$

8.6.5    $\dot{h}_{ji}(t) = \left[ b_{ji}(t) - [\min(b_{ji}(t), c_{ji}(t))]^+ \right]^+ - R(\dot{p}_j(t - \tau))\, h_{ji}(t)$

8.6.6    $\dot{b}_{ji}(t) = [-\dot{p}_j(t - \tau)]^+ - [\min(b_{ji}(t), c_{ji}(t))]^+$

where:

$$[y]^+ = \begin{cases} y & \text{if } y > 0 \\ 0 & \text{if } y \le 0 \end{cases}$$

8.6.7    $\dot{c}_{ji}(t) = \left[ -\frac{d}{dt} [x_i(t)]^+ \right]^+ - w_c [c_{ji}(t) - b_{ji}(t)]^+ - [\min(b_{ji}(t), c_{ji}(t))]^+$
Definition of the variables:

$x_i(t)$ is the conventional x process which occurs at node $V_i$,

$p_j(t - \tau)$ is the prediction signal at the arrowheads connected
to node $V_j$ by directed edges. Since we do not allow negative amplitudes
for prediction signals, $p_j(t - \tau) = [x_j(t - \tau)]^+$. Only the first
derivative, $\dot{p}_j(t - \tau)$, is used in the above equations,

$P_i(t)$ is the conventional event input impulse. In this study of
the chemical embedding field networks, $P_i(t)$ will be constrained to
be an impulse of amplitude A,

$z_{ji}(t)$ is the amount of excitory transmitter substance, E, in the
$S_{ji}$ synaptic cleft,

$h_{ji}(t)$ is the amount of inhibitory transmitter substance, H, in
the $S_{ji}$ synaptic cleft,

$b_{ji}(t)$ is the amount of B substance in the $S_{ji}$ synaptic cleft,

$c_{ji}(t)$ is the amount of C substance in the $S_{ji}$ synaptic cleft.

Definition of the constants:

$\alpha$ is the decay rate for x processes,

a is the effectiveness factor for E substance. Release of 1 unit
of E in the synaptic cleft will result in an instantaneous increase
in the amplitude of the adjacent node's x process of a,

$\gamma$ is the effectiveness factor for H substance. Release of 1 unit
of H in the synaptic cleft will result in an instantaneous increase in
the amplitude of the adjacent node's x process of $\gamma$. $\gamma$ will have
negative values throughout the rest of this study,

$\tau$ is the transmission delay due to finite transmission velocities
on directed edges. A signal which originates at the node at time $t_0$
will arrive at the arrowheads connected to this node at time $t_0 + \tau$,

$w_c$ is the rate constant for deactivation of C substance.

Discussion: 

Equations 8.6.2 through 8.6.7 are a mathematical description of
the processes we have invented in this chapter. They are different
from equations 8.4.15 through 8.4.18 because they include the addition
of the inhibitory processes.

The functions $a \sum_j R(\dot{p}_j(t - \tau))\, z_{ji}(t)$ and $\gamma \sum_j R(\dot{p}_j(t - \tau))\, h_{ji}(t)$
in equation 8.6.2 say that when the prediction signal $p_j(t - \tau)$
arriving at the $N_{ji}$ arrowhead is increasing, all the E and H substances
in the synaptic cleft are released instantly. The release of these
substances at time $t_0$ causes an instant increase in the amplitude of
the adjacent $x_i(t)$ process of $a z_{ji}(t_0) + \gamma h_{ji}(t_0)$.

E substance is produced in the synaptic cleft according to the
instantaneous reaction:

1*B + 1*C → 1*E

Because of the unit coefficients in this equation, the maximum amount of E
that can be produced at any time is the minimum of the reactants
available. Equation 8.6.6 says that the amount of B being released
into the $S_{ji}$ cleft per second is $[-\dot{p}_j(t - \tau)]^{+}$. That is, the amount
of B being released from the $N_{ji}$ arrowhead into the $S_{ji}$ cleft is
directly proportional to the decrease per second in the amplitude of
the prediction signal at the $N_{ji}$ arrowhead. The B substance thus released
is first made available for reaction with C to form E. If there is
any B left over after this reaction, it reacts with S substance to form
H substance. S substance is always present in large quantities in the
cleft. Equation 8.6.5 says this mathematically.

The amount of C released into the $S_{ji}$ cleft per second is directly
proportional to the decrease per second in the amplitude of the adjacent
$V_i$ node's x process, provided that x process is positive. The term
$[-\frac{d}{dt}[x_i(t)]^{+}]^{+}$ in equation 8.6.7 says this. The amount of C present
in the cleft is first made available to react with any B present to
form E. If there is any C left over after this reaction, it is
deactivated at rate $w_c$. Equation 8.6.7 states this mathematically.

Although equations 8.6.3 through 8.6.7 are complicated and
describe a complicated set of simultaneous processes, they are fairly
straightforward to simulate on a digital computer. In the next
section, we shall study an outstar network governed by these equations.
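The bookkeeping described in this discussion can be illustrated with a short Python sketch. It is only a sketch under simplifying assumptions: the function name and its arguments are hypothetical, it works with release rates at a single instant, and it ignores any C already accumulated in the cleft as well as the later deactivation of leftover C at rate w_c.

```python
def cleft_formation_rates(b_rate, c_rate):
    """Split released B between E and H formation (all values are rates per second).

    Assumption: only the B and C being released at this instant take part.
    """
    b = max(b_rate, 0.0)        # B release rate cannot be negative
    c = max(c_rate, 0.0)        # C release rate cannot be negative
    e_rate = min(b, c)          # 1*B + 1*C -> 1*E, limited by the scarcer reactant
    h_rate = b - e_rate         # leftover B reacts with the abundant S to form H
    c_left = c - e_rate         # leftover C, which would then be deactivated
    return e_rate, h_rate, c_left
```

For example, with B released at 3 units per second and C at 1 unit per second, this sketch forms E at 1 unit per second and H at 2 units per second, with no C left over.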
Section 8.7

A Chemical Outstar

An outstar composed of chemical embedding field elements was
set up. The standard experiment that has been performed on the other
outstars studied was performed. The event inputs to the nodes were
specified to be impulses of amplitude A = 10. From equations 8.6.2
through 8.6.7, there are five network parameters to be specified:
$\alpha$, $\tau$, $a$, $\delta$, and $w_c$. $\alpha$ and $\tau$ were specified as in the past:

$\alpha = 3.333\ \text{sec}^{-1}$
$\tau = 0.3\ \text{sec}$

The deactivation rate for C substance, $w_c$, was arbitrarily specified
to be:

$w_c = 0.5\ \text{sec}^{-1}$

Since an excitatory transmitter (E) substance effectiveness factor
of $a = 1.0$ has resulted in self-sustaining systems in the past, $a$ was
specified to be:

$a = 1.0$

The specification of the new inhibitory transmitter (H) substance
effectiveness factor, $\delta$, will require some discussion. The chemical
outstar conforms to the logic tabulated in table 8.6.1. The three
assignments in that table which can cause the "z" processes to be
driven to non-ambient states are:

8.7.1    $(x_j = +1,\ x_i = +1) \rightarrow +1$
8.7.2    $(x_j = +1,\ x_i = 0) \rightarrow -1$
8.7.3    $(x_j = +1,\ x_i = -1) \rightarrow -1$

In the current context, 8.7.1 says that an excited prediction signal
at an arrowhead and an excited x process at the adjacent node result
in the production of E substance. This is equivalent to driving a
"z" process in the excitatory direction. The other two assignments say
that when the prediction signal is in an excited state and the adjacent
node x process is in an ambient or inhibited state, H substance will be
produced. This is equivalent to driving a conventional "z" process
in the inhibitory direction. In the last chapter, we introduced the
idea that an outstar may have its grid and command nodes randomly
excited before it "goes to school" to learn a pattern. According to
the logic, random excitement of the grid nodes cannot change the
"z" process state. However, random excitement of the command node
can result in the outstar learning to directly inhibit all the grid
nodes according to assignments 8.7.2 and 8.7.3. In a real environment
we can expect this to be the case before the outstar "goes to school".
Thus the outstar will be inhibitorily biased before we try to teach
it a pattern. We must ensure that this inhibitory biasing is not so
great as to prevent the outstar from learning a pattern.

To facilitate this discussion, we will prove the following lemma:

Lemma 8.7.1

Let a node $V_1$ have an arrowhead $N_{12}$ impinging on another node $V_2$.
Let the functions $z_{12}(t) = h_{12}(t) = 0$ initially. Let node $V_1$ be excited by a
positive impulse of amplitude $A_1$ at time $t_1$. Let node $V_2$ be excited at
time $t_2 = t_1 + \tau$ by an input impulse of amplitude $A_2$ which may be
negative. Then the amount of H substance in the synaptic cleft $S_{12}$
at times $t \gg t_1 + \tau = t_2$ is:

\[
h_{12}(t \gg t_2) =
\begin{cases}
0 & \text{if } 0 < A_1 < A_2 \\
A_1 - A_2 & \text{if } 0 < A_2 \le A_1 \\
A_1 & \text{if } A_2 \le 0
\end{cases}
\]

And the amount of E substance in the synaptic cleft $S_{12}$ at times
$t \gg t_1 + \tau = t_2$ is:

\[
z_{12}(t \gg t_2) =
\begin{cases}
A_1 & \text{if } 0 < A_1 < A_2 \\
A_2 & \text{if } 0 < A_2 \le A_1 \\
0 & \text{if } A_2 \le 0
\end{cases}
\]

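Proof:

The input impulse to node $V_1$ results in a prediction signal arriving
at the arrowhead $N_{12}$ at time $t_1 + \tau = t_2$ which is:

\[
p_{12}(t) =
\begin{cases}
0 & \text{for } t < t_2 \\
A_1 e^{-\alpha (t - t_2)} & \text{for } t \ge t_2
\end{cases}
\]

The input impulse to node $V_2$ results in an $x_2(t)$ process:

\[
x_2(t) =
\begin{cases}
0 & \text{for } t < t_2 \\
A_2 e^{-\alpha (t - t_2)} & \text{for } t \ge t_2
\end{cases}
\]

The amount of B substance released into $S_{12}$ is:

\[
b_{12}(t) = [-\dot{p}_{12}(t)]^{+} =
\begin{cases}
0 & \text{for } t < t_2 \\
A_1 \alpha e^{-\alpha (t - t_2)} & \text{for } t \ge t_2
\end{cases}
\]

The amount of C substance released into $S_{12}$ is:

\[
c_{12}(t) = \left[-\frac{d}{dt}[x_2(t)]^{+}\right]^{+} =
\begin{cases}
0 & \text{for } t < t_2 \\
0 & \text{for } t \ge t_2 \text{ if } A_2 \le 0 \\
A_2 \alpha e^{-\alpha (t - t_2)} & \text{for } t \ge t_2 \text{ if } A_2 > 0
\end{cases}
\]

Thus the amount of E being formed is:

\[
z_{12}(t) = \int_{t_2}^{t} [\min(b_{12}(s), c_{12}(s))]^{+}\, ds =
\begin{cases}
0 & \text{for } t \ge t_2 \text{ if } A_2 \le 0 \\
A_2 (1 - e^{-\alpha (t - t_2)}) & \text{for } t \ge t_2 \text{ if } 0 < A_2 \le A_1 \\
A_1 (1 - e^{-\alpha (t - t_2)}) & \text{for } t \ge t_2 \text{ if } 0 < A_1 < A_2
\end{cases}
\]

Thus for $t \gg t_2$:

\[
z_{12}(t \gg t_2) =
\begin{cases}
0 & \text{if } A_2 \le 0 \\
A_2 & \text{if } 0 < A_2 \le A_1 \\
A_1 & \text{if } 0 < A_1 < A_2
\end{cases}
\]

The amount of H being formed is equal to the amount of B left over
after the E production reaction:

\[
h_{12}(t) = \int_{t_2}^{t} \left[ b_{12}(s) - [\min(b_{12}(s), c_{12}(s))]^{+} \right] ds =
\begin{cases}
(A_1 - A_2)(1 - e^{-\alpha (t - t_2)}) & \text{if } 0 < A_2 \le A_1, \text{ for } t \ge t_2 \\
0 & \text{if } 0 < A_1 < A_2, \text{ for all } t \\
A_1 (1 - e^{-\alpha (t - t_2)}) & \text{if } A_2 \le 0, \text{ for } t \ge t_2
\end{cases}
\]

Thus for $t \gg t_2$:

\[
h_{12}(t \gg t_2) =
\begin{cases}
0 & \text{if } 0 < A_1 < A_2 \\
A_1 - A_2 & \text{if } 0 < A_2 \le A_1 \\
A_1 & \text{if } A_2 \le 0
\end{cases}
\]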
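The limiting amounts stated by lemma 8.7.1 can be checked numerically. The Python sketch below accumulates E and H from the B and C release rates used in the proof and compares the totals with the lemma's closed form. It rests on assumptions that are not part of the original text: the function names are hypothetical, time is measured from $t_2$, the clefts start empty, leftover C and its deactivation are ignored, and a midpoint sum stands in for the integrals.

```python
import math

def lemma_amounts(a1, a2):
    """Closed-form (E, H) amounts for t >> t_2 as stated by lemma 8.7.1."""
    if a2 <= 0:
        return 0.0, a1
    if a2 <= a1:
        return a2, a1 - a2
    return a1, 0.0

def integrate_cleft(a1, a2, alpha=3.333, step=1e-4, t_end=5.0):
    """Numerically accumulate (E, H) after t_2 from the release rates in the proof."""
    e_total = 0.0
    h_total = 0.0
    for n in range(int(t_end / step)):
        s = (n + 0.5) * step                  # midpoint of the step, time since t_2
        decay = alpha * math.exp(-alpha * s)
        b = a1 * decay                        # B release rate from the arrowhead
        c = max(a2, 0.0) * decay              # C release rate, zero unless x_2 is positive
        e_rate = min(b, c)                    # E formation is limited by the scarcer reactant
        e_total += e_rate * step
        h_total += (b - e_rate) * step        # leftover B becomes H
    return e_total, h_total
```

With $A_1 = 10$ and $A_2 = 4$, for instance, both functions give approximately 4 units of E and 6 units of H.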



Note that by lemma 8.7.1, $h_{12}(t \gg t_2) \ge 0$. Also

note that immediately after arrival of a prediction signal at the
arrowhead $N_{12}$, $h_{12}(t) = z_{12}(t) = 0$. Thus lemma 8.7.1 applies to the situations
where there is H and/or E substance present in $S_{12}$ before arrival of
a prediction signal. Since arrival of a prediction signal causes
the equivalent of input impulses of amplitudes $a z_{12}(t)$ and $\delta h_{12}(t)$
to be delivered to $V_2$, this lemma can be used in all cases by setting
$A_2 = a z_{12}(t) + \delta h_{12}(t) + A_i$, where $A_i$ is the amplitude of an external
input impulse, if any.

Now, suppose we start our outstar in a state of initial ignorance.
That is, $z_{ci}(0) = h_{ci}(0) = 0$. Let $0 > \delta > -1$. We then excite the
command node with an input impulse of amplitude A without exciting the
grid nodes. By lemma 8.7.1, $h_{ci}(t) \rightarrow A$ and $z_{ci}(t) \rightarrow 0$. Suppose we
excite the command node again without exciting the grid nodes. When
the prediction signal arrives at the arrowheads, all the transmitter
substance is released. Thus the grid nodes are excited by impulses
of amplitude $\delta A < 0$. Then by lemma 8.7.1, A units of H will be produced
in the synaptic clefts. Thus, before the outstar "goes to school",
the synaptic clefts contain A units of H and 0 units of E.

Now let the outstar "go to school". The command node and the grid
nodes are excited with input impulses of amplitude A. The command
prediction signal will cause a further impulse of amplitude $\delta A < 0$
to excite each of the grid nodes. Thus the grid nodes will be excited
by a total input of $A(1 + \delta)$. Thus $A(1 + \delta)$ of E will be produced.
(Remember that $0 > \delta > -1$.)

Suppose we want the outstar to be able to directly inhibit a single
occurrence of a random mistake. To the outstar, the first presentation
of the pattern after going to school is considered a random mistake.
Thus we want as much H produced as E. On this criterion, $\delta = -0.5$
is specified. Now, let us present the pattern a second time. The total
input impulse amplitude to the grid nodes in the pattern will be
$A(1 + \delta) + A\delta + A = 0.5A - 0.5A + A = A$. Thus A units of E will be
produced on the second presentation of the pattern, and 0 units of H will
be produced.

Thus by specifying $\delta = -0.5$, we will have an outstar that is
resistant to single occurrences of random mistakes, but will learn
a pattern well in two presentations. Therefore, for the experiment,
$\delta$ is specified to be:

$\delta = -0.5$
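The balance criterion behind this choice can be written out as a short worked step. This is a sketch that assumes lemma 8.7.1 is applied to the first "school" presentation with $A_1 = A$ and $A_2 = A(1 + \delta)$, so that $0 < A_2 < A_1$, and it writes $\delta$ for the H effectiveness factor:

```latex
E = A_2 = A(1 + \delta), \qquad
H = A_1 - A_2 = A - A(1 + \delta) = -\delta A, \qquad
E = H \;\Rightarrow\; 1 + \delta = -\delta \;\Rightarrow\; \delta = -\tfrac{1}{2}.
```

With A = 10 this gives 5 units each of E and H on that presentation.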
Figure 8.7.1. Teaching a chemical outstar the pattern.
Figure 8.7.1 shows the results of the first part of the experiment.
The command node is excited once alone at the beginning of the
experiment. The z traces show that no E was produced in the synaptic
clefts. The h traces show that 10 units of H were produced in the clefts.
Thus the outstar is inhibitorily biased before "going to school".
"School" begins with the second command node excitement. Event 2 is
presented exactly when the command prediction signal arrives at the
arrowheads. Event 3 is presented $2/\alpha = 0.6$ seconds later. The
pattern is presented twice.

In both presentations of the pattern, significantly more H is
produced in the $S_{c3}$ cleft than E. Since event 1 is not presented,
10 units of H are produced in the $S_{c1}$ cleft. No E is produced in the
$S_{c1}$ cleft. On the first presentation of the pattern, the amounts of
E and H produced in the $S_{c2}$ cleft approximately balance. On the second
presentation of the pattern, 10 units of E are produced in the $S_{c2}$
cleft and no H is produced.

The fourth excitement of the command node results in a prediction
excitement of the grid. The third response on the grid x traces is
this prediction excitement of the grid. From the results we can conclude
that the outstar has learned the pattern $V_2 \rightarrow V_3$. It has also learned
to directly inhibit the grid nodes that are not in the pattern.

The experiment was continued to test the resistance to a random mistake
in the previously learned pattern. Figure 8.7.2 shows the results.
The direct inhibition of $V_1$ which the outstar had previously learned
caused $x_1(t)$ to rise to a value of only 5. (The input impulse
has an amplitude of 10.) The amounts of H and E produced in the $S_{c1}$
cleft approximately balance. Thus when a prediction is excited by the
Figure 8.7.2. Resistance to random mistakes in a chemical outstar.

Figure 8.7.3. An unsuccessful attempt to correct a previously learned
pattern in a chemical outstar.



second excitement of the command node, $x_1(t)$ rises to only a slight
positive value. The amount of H produced in $S_{c1}$ during the prediction
excitement is considerably more than the E produced. Further prediction
excitements will result in inhibited amplitudes for $x_1(t)$. We may
conclude that the outstar has good resistance to random mistakes.

The experiment was continued to test the correctability of the
outstar. The correcting pattern was presented twice. Figure
8.7.3 shows the results. Although the outstar did learn the correcting
pattern, it did not "unlearn" the previously learned pattern.
There is no "forgetting rate" in the chemical outstar. Thus the old
pattern cannot be forgotten. There is also no lateral inhibition
in this outstar. Thus, this chemical outstar lacks the two mechanisms
whereby previously learned patterns can be removed from its memory.
This is a major drawback in this outstar. Further work with it would
require investigations of the effects of a finite forgetting rate for
the E and H substances in the synaptic clefts. Additionally, the effects
of lateral inhibition should be investigated.
APPENDIX A

The Digital Simulation and its Accuracy

The equations which were simulated in this thesis were simultaneous
nonlinear differential difference equations. They fell into
three basic types:

A.1    $\dot{x}(t) = -\alpha x(t) + I_x(t)$

A.2    $\dot{y}(t) = -\alpha y(t) + z(t)x(t - \tau) + I_y(t)$

A.3    $\dot{z}(t) = -u z(t) + y(t)x(t - \tau)$

Figure A.1 shows a system flow diagram for this set of equations.

The key to the digital simulation is the algorithm used for the
integrators. This thesis used a simple Euler rule algorithm. That
is, the integral:

\[ r(t) = \int \dot{r}(t)\, dt \]

was simulated by the algebraic equation:

\[ r(t + h) = r(t) + \dot{r}(t)h \]

where h is the digital increment.
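As an illustration of this Euler rule, the following Python sketch applies it to the free decay $\dot{x} = -\alpha x$ and sets the result beside the exact exponential. The original programs were written in FOCAL, so this is not the thesis code; the function names and the starting value of 10 are assumptions made for the example.

```python
import math

def euler_step(r, r_dot, h):
    """Advance r by one digital increment h using the Euler rule."""
    return r + r_dot * h

def euler_decay(x0, alpha, h, n_steps):
    """Simulate dx/dt = -alpha * x from x(0) = x0 with no input."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(euler_step(xs[-1], -alpha * xs[-1], h))
    return xs

# Simulated samples and the exact solution x0 * exp(-alpha * t) at the same times.
xs = euler_decay(10.0, 3.3333, 0.1, 3)
exact = [10.0 * math.exp(-3.3333 * 0.1 * n) for n in range(4)]
```

Each simulated sample is the previous one multiplied by $(1 - h\alpha)$, so the simulated trace decays somewhat faster than the exact exponential.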

The Euler rule algorithm was adopted because it is easy to program
on a high speed digital computer and the computations require
comparatively little computation time. The large number of experiments simulated
in this thesis required efficient use of computation time. Most of the
experiments involved at least seven variables and required over fifty
increments. Thus the simplest and fastest integration algorithm was
selected.



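Figure A.1. A signal flow diagram for the simulation.

The sampled data "z" transform for the equation

A.4    $\dot{x}(t) = -\alpha x(t) + I_x(t)$

using an Euler rule integration algorithm is:

\[ X^*(z) = \frac{h}{z - 1 + h\alpha}\, I_x(z) \]

For:

\[ I_x(t) = u_{-1}(t) \]

where:

\[
u_{-1}(t) =
\begin{cases}
0 & \text{for } t < 0 \\
1 & \text{for } t \ge 0
\end{cases}
\]

$X^*(z)$ is:

\[
X^*(z) = \frac{h}{z}\left[ \frac{1}{h\alpha}\,\frac{z}{z - 1} + \left(1 - \frac{1}{h\alpha}\right)\frac{z}{z - 1 + h\alpha} \right]
\]

The time varying function which this transforms to is:

\[
x^*(t) = \frac{1}{\alpha}\,u_{-1}(t - h) + \left(h - \frac{1}{\alpha}\right)e^{-\gamma (t - h)} \quad \text{for } t \ge h
\]

where:

\[ \gamma = -\frac{1}{h}\ln(1 - h\alpha) \]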
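The sampled step response above can be cross-checked against a direct Euler iteration of A.4 with a unit step input. The Python sketch below does this under stated assumptions: the function names are hypothetical, $x(0) = 0$, and $h\alpha < 1$ so that the logarithm is defined.

```python
import math

def simulated_decay_rate(alpha, h):
    """Decay rate gamma of the Euler-simulated process, gamma = -(1/h) ln(1 - h*alpha)."""
    return -math.log(1.0 - h * alpha) / h

def x_star(t, alpha, h):
    """Closed-form sampled step response, valid at sample times t >= h."""
    gamma = simulated_decay_rate(alpha, h)
    return 1.0 / alpha + (h - 1.0 / alpha) * math.exp(-gamma * (t - h))

def euler_step_response(alpha, h, n):
    """Value after n Euler increments of dx/dt = -alpha*x + 1, starting from x = 0."""
    x = 0.0
    for _ in range(n):
        x = x + (1.0 - alpha * x) * h
    return x
```

At every sample time $t = nh$ with $n \ge 1$ the two functions agree to rounding error, which is a useful sanity check on the partial fraction expansion.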



The continuous solution to A.4 when $I_x(t) = u_{-1}(t)$ is:

\[ x(t) = \frac{1}{\alpha}\left(1 - e^{-\alpha t}\right) \quad \text{for } t \ge 0 \]

For $t \ge h$, the ratio:

A.5    $\dfrac{x^*(t)}{x(t - h)} = 1 + \dfrac{\alpha h\, e^{-\gamma (t - h)}}{1 - e^{-\alpha (t - h)}}$    for $t \ge h$

computed at $(t - h) = 1/\alpha$ was used to check the accuracy of the amplitudes
of the digitally simulated function $x^*(t)$. The ratio $\gamma/\alpha$ was used to
check the accuracy of the simulated decay rate, $\gamma$. The two most
frequently used choices for $\alpha$ and $h$ in this study were:

$\alpha = 3.3333$, $h = 0.1$

and:

$\alpha = 1.6666$, $h = 0.1$





The following table shows the accuracy of the simulation to a
step input:

    $\alpha h$      $\gamma/\alpha = -\frac{1}{\alpha h}\ln(1 - \alpha h)$      $\frac{x^*(t)}{x(t - h)} = 1 + \frac{\alpha h\, e^{-\gamma (t - h)}}{1 - e^{-\alpha (t - h)}}$

    0.3333      1.170      1.163
    0.166666    1.097      1.087



Since all of the input pulses used in the study were of duration
$S = 1/\alpha$, the response of the x processes to input impulses is in error
by at most 17%. The simulated decay rates are in error by at most
17% also.

No attempt was made to analytically compute the error in the
simulated response of the x and z processes to nonlinear inputs.
The results were self-consistent and agreed qualitatively with
Grossberg's theoretical predictions. Throughout this study a
qualitative feel for the networks studied and the parameters involved
in them was the primary concern. As long as the simulation agreed
qualitatively with theoretical expectations, little concern was given
to the possibility of up to 17% amplitude errors in the computations.
The computation order and actual equations used to simulate
equations A.1 through A.3 were:

A.5    $x(t + h) = x(t) + (I_x(t) - \alpha x(t))h$

A.6    $y(t + h) = y(t) + (I_y(t) - \alpha y(t) + z(t)x(t - \tau))h$

A.7    $z(t + h) = z(t) + (y(t + h)x(t + h - \tau))h$

where $t = hn$ and $n$ is an integer.

$\tau$ was always chosen to be an integer multiple of h. The sequence
A.5, A.6, A.7 was computed and then started again with A.5 for the next
incrementation. Thus the values for z(t) in A.6 were effectively
delayed by h.
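The update order can be made concrete with a short Python sketch of the loop. This is an illustration rather than the original FOCAL program: the function and argument names are hypothetical, the inputs are passed as functions of the increment index n, and the network is assumed to be at rest before $t = 0$.

```python
def simulate(alpha, tau, h, n_steps, input_x, input_y, z0=0.0):
    """Run the A.5, A.6, A.7 sequence for n_steps increments of size h."""
    d = int(round(tau / h))                  # delay in increments (tau is a multiple of h)
    x = [0.0] * (n_steps + 1)
    y = [0.0] * (n_steps + 1)
    z = [z0] * (n_steps + 1)

    def delayed(seq, n):
        # Assumed initial condition: everything is zero before t = 0.
        return seq[n] if n >= 0 else 0.0

    for n in range(n_steps):
        # A.5: x process
        x[n + 1] = x[n] + (input_x(n) - alpha * x[n]) * h
        # A.6: y process, using the z value from the previous increment
        y[n + 1] = y[n] + (input_y(n) - alpha * y[n] + z[n] * delayed(x, n - d)) * h
        # A.7: z process, using the freshly computed y and the delayed x
        z[n + 1] = z[n] + (y[n + 1] * delayed(x, n + 1 - d)) * h
    return x, y, z
```

For instance, `simulate(3.3333, 0.3, 0.1, 20, lambda n: 1.0 if n < 3 else 0.0, lambda n: 1.0 if 3 <= n < 6 else 0.0)` drives x with a three-increment pulse and y with a pulse that begins as the delayed x signal arrives, so z starts to grow only from the fourth increment onward.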

The digital computer used for the simulations reported was a
Digital Equipment Corporation PDP-9 with 32K of core memory. The
programs used were programmed in Digital Equipment Corporation's
interpretive language FOCAL. The choice to use FOCAL was made because
FOCAL allows the dimensions of matrices to be a variable that can be
specified at run time. The programs used stored the value of each of
the variables being computed after each incrementation. The stored
values were output at the end of each run. Since the number of
variables and the number of incrementations per run varied considerably,
the ability to specify matrix dimensions in the programs
immediately before the run was a great advantage.

The minimum accuracy in calculations performed by FOCAL is six
digits. Since the sampled data error was much larger than this,
six-digit computational accuracy was entirely sufficient.